falcon-dev mailing list archives

From "sandeep samudrala (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-1852) Optional Input for a process not truly optional
Date Wed, 15 Jun 2016 10:41:09 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331519#comment-15331519 ]

sandeep samudrala commented on FALCON-1852:

This patch adds optional inputs to the datasets in the coordinator definition. Once this
version is released (that is, once the EL extensions jar is pushed to Oozie), coordinators that
are already running will not have this input as a dataset, while any newly triggered workflow
instance expects it to be one, so the workflow fails with the message below:
variable [optionalInput] cannot be resolved

After the update has been pushed, the affected processes have to be touched/updated so that
their instances run successfully.

> Optional Input for a process not truly optional
> -----------------------------------------------
>                 Key: FALCON-1852
>                 URL: https://issues.apache.org/jira/browse/FALCON-1852
>             Project: Falcon
>          Issue Type: Bug
>            Reporter: Pallavi Rao
>            Assignee: Pallavi Rao
>              Labels: backward-incompatible
>             Fix For: 0.10
> Currently, when a feed input is marked as optional, we do not add it to the coordinator
> definition's datasets. This means we do not wait for all instances (for a given data window)
> to arrive. Instead, we just resolve the paths for a data window and pass them as a parameter.
> For example:
> {noformat}
> <inputs>
>         <!-- In the workflow, the input paths will be available in a variable 'inpaths' -->
>         <input name="inpaths" feed="in" start="now(0,-5)" end="now(0,-1)"/>
>         <input name="in2paths" feed="in2" start="now(0,-5)" end="now(0,-1)" optional="true"/>
>     </inputs>
> {noformat}
> For a process instance 2013-01-01T00:00Z, the optional input, in2paths, will be resolved
> as below:
> {noformat}
>  <property>
>     <name>in2paths</name>
>     <value>hdfs://localhost:9000/data/in2/2013/11/15/00/04,hdfs://localhost:9000/data/in2/2013/11/15/00/03,hdfs://localhost:9000/data/in2/2013/11/15/00/02,hdfs://localhost:9000/data/in2/2013/11/15/00/01,hdfs://localhost:9000/data/in2/2013/11/15/00/00</value>
>   </property>
> {noformat}
> If one of the instances of in2paths (for example, hdfs://localhost:9000/data/in2/2013/11/15/00/04)
> is missing, the workflow will fail anyway.
> Hence, the input in2paths is not truly optional. The only difference is that the triggering
> of the instance is not gated on it.
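
The behaviour described in the issue can be illustrated with a small sketch. This is not Falcon or Oozie code: the helper names (window_paths, missing_instances) are hypothetical, and the minute-level path layout is only inferred from the example paths quoted above. It shows that the whole data window is joined into one comma-separated value regardless of whether each instance exists, so a single missing instance is still handed to the workflow.

```python
from datetime import datetime, timedelta
from typing import Callable, List

# Path prefix copied from the example in the issue description.
BASE = "hdfs://localhost:9000/data/in2"


def window_paths(end: datetime, count: int) -> List[str]:
    """Return `count` minute-level paths, newest first, ending at `end`.

    Hypothetical helper: the yyyy/MM/dd/HH/mm layout is an assumption
    inferred from the paths shown in the issue.
    """
    return [
        f"{BASE}/{(end - timedelta(minutes=i)):%Y/%m/%d/%H/%M}"
        for i in range(count)
    ]


def missing_instances(paths: List[str], exists: Callable[[str], bool]) -> List[str]:
    """Return the paths for which `exists` reports no data."""
    return [p for p in paths if not exists(p)]


# The window is resolved without checking availability and joined with commas.
paths = window_paths(datetime(2013, 11, 15, 0, 4), 5)
in2paths = ",".join(paths)

# Simulate a store in which only the newest instance has not arrived yet.
available = set(paths[1:])
missing = missing_instances(paths, lambda p: p in available)

print(in2paths)
print("missing:", missing)
```

Because in2paths still lists the missing path, a workflow that reads every path in the value would fail on it, which is the situation the issue describes.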
