falcon-dev mailing list archives

From "Venkatesan Ramachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-2030) Enforce time partition pattern in the data location path in feed definition
Date Wed, 15 Jun 2016 21:37:09 GMT

    [ https://issues.apache.org/jira/browse/FALCON-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332635#comment-15332635 ]

Venkatesan Ramachandran commented on FALCON-2030:

[~ajayyadava] welcome back and thanks for the info. 

The reason is that we hit FALCON-2023 when no pattern is specified in the path. Also, for
snapshot-like data (the use case you are referring to), it would be better to write each snapshot
under a subfolder named by a timestamp pattern or a version number (for example, EPOCH as a number).
When reading, workflows can use the LATEST EL to pick up the latest folder and consume it.

This way, the dataset's versions can be tracked and maintained. Even metadata can change
(append/remove/update), although at a very slow rate. Writing each version to its own folder
ensures that in-flight workflows/pipelines are not affected by the addition/removal/update of data.
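
As a rough sketch of the layout described above, the two fragments below show a feed location that writes each snapshot under a dated subfolder, and a process input that reads only the most recent instance. The dated path, the feed name clicksMetaDataFeed, and the input name are hypothetical, and latest(0) is assumed here to be the EL expression that the "LATEST EL" refers to; treat this as an illustration, not a tested definition.

```xml
<!-- Feed fragment: each snapshot lands in its own dated subfolder.
     The path below is illustrative only. -->
<locations>
    <location type="data" path="/projects/falcon/clicksMetaData/${YEAR}-${MONTH}-${DAY}"/>
</locations>

<!-- Process fragment: consume only the latest available instance.
     Feed name and input name are hypothetical; latest(0) is assumed. -->
<inputs>
    <input name="metadata" feed="clicksMetaDataFeed" start="latest(0)" end="latest(0)"/>
</inputs>
```

With this arrangement, a new snapshot becomes a new folder instead of overwriting the old one, so a workflow that has already resolved an earlier instance keeps reading that instance.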

Let me know what you think.

> Enforce time partition pattern in the data location path in feed definition 
> ----------------------------------------------------------------------------
>                 Key: FALCON-2030
>                 URL: https://issues.apache.org/jira/browse/FALCON-2030
>             Project: Falcon
>          Issue Type: Improvement
>          Components: feed
>            Reporter: Venkatesan Ramachandran
>            Assignee: Venkatesan Ramachandran
> In a feed definition, the data location can be specified without a time-series pattern, as below:
>     <locations>
>         <location type="data" path="/tmp/falcon-regression/RetentionTest/testFolders/"/>
>         <location type="stats" path="/projects/falcon/clicksStats"/>
>         <location type="meta" path="/projects/falcon/clicksMetaData"/>
>     </locations>
