falcon-dev mailing list archives

From "Srikanth Sundarrajan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-1406) Effective time in Entity updates.
Date Thu, 24 Nov 2016 02:42:58 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691996#comment-15691996 ]

Srikanth Sundarrajan commented on FALCON-1406:

Thanks [~ajayyadava]. I wanted to bring the notes from our discussion back into the JIRA so
that others can chime in as well.

+*1. Motivation for the feature*+
There is a need for entity updates that take effect in the past, because of code issues and
data schema changes that apply retroactively. There is also a need for a clean way to do this
without cluttering the system with new temporary entities or other hacks. There is general
agreement on the utility of the feature.

+*2. Versioning of Entities*+
Would versioning of entities be a more generic and cleaner way to handle this? The points
discussed around this were:
  * Would it make sense to track and maintain versions for feeds, and what challenges would
consumers of the data/feed face in depending on a versioned feed?
  * If entities are versioned, would all the APIs, and hence the end users, have to be version
aware in all operations?
  * Would versioning solve this problem more cleanly, and if so, how?

These are what we felt would be good answers to these questions.
  * Versioning of feeds would indeed make things difficult for the consumers. Having
processes depend on the latest definition of the feed at the time of their execution seemed
the right approach (lifecycle action execution would still benefit from versioning, more on
that later).
  * Processes, on the other hand, would benefit from versioning, as there is code associated
with them. There are a number of ways to look at the versioning scheme. If time (loosely,
effective time) were treated as the equivalent of a version, then the current feature does
allow for a rudimentary versioning scheme. But the rest of the system, particularly the config
store, has to be version aware (regardless of the versioning scheme). If any of the sub-services
within the Falcon system use the definition of the entity to build out further capabilities,
then those have to be version aware as well (for example, SLA monitoring, alerting, and the like).
While the system itself has the ability to track the version/history of an entity, it didn't
seem right to burden the users (or the APIs) with being version aware. It would be helpful to
retain the current semantics. However, definition listing, feed instance availability, and
dependency APIs would benefit from being version aware.
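
To make the "effective time as a version" idea concrete, here is a minimal sketch. It is not Falcon code: the class `EntityHistory` and its methods are hypothetical names, and the resolution rule (an instance uses the definition with the greatest effective time at or before its own time) is an assumption about how such a scheme could behave. Callers pass only an instance time and never a version number, which is one way the current semantics could be kept for users.

```python
from bisect import bisect_right
from datetime import datetime


class EntityHistory:
    """Hypothetical history of one entity, keyed by effective time."""

    def __init__(self):
        # Parallel lists kept sorted by effective time.
        self._times = []
        self._definitions = []

    def update(self, effective_time, definition):
        """Record a definition; effective_time may lie in the past."""
        i = bisect_right(self._times, effective_time)
        self._times.insert(i, effective_time)
        self._definitions.insert(i, definition)

    def definition_at(self, instance_time):
        """Return the definition in effect for the given instance time."""
        i = bisect_right(self._times, instance_time)
        if i == 0:
            raise LookupError("no definition effective at or before this time")
        return self._definitions[i - 1]


history = EntityHistory()
history.update(datetime(2016, 1, 1), "process-def-v1")
history.update(datetime(2016, 3, 1), "process-def-v2")
# A retroactive update: submitted last, but effective from February.
history.update(datetime(2016, 2, 1), "process-def-fixed")

print(history.definition_at(datetime(2016, 1, 15)))  # process-def-v1
print(history.definition_at(datetime(2016, 2, 15)))  # process-def-fixed
print(history.definition_at(datetime(2016, 3, 15)))  # process-def-v2
```

In this sketch the retroactive update only affects instances between February and March. Whether it should also override the later March definition is exactly the kind of question the design document would need to settle.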

+*3. Known and unknown gaps*+
  * There are many sub-services, particularly on the instance start/finish path, that may
break if this change is not handled correctly.
  * Scheduled feeds can have problems similar to those of processes, since an update to them
can also be made retroactively.

+*4. Way forward*+
  * Write a design document that identifies the gaps in other components affected by effective
time and, in particular, if entities are to be treated as versioned, what changes these
components would need.
  * Identify and file JIRAs for these gaps and address them.
  * As a community, we can then review and ensure that all known gaps are covered in the design
document and that the issues are tracked.

I request [~ajayyadava] to chime in with any missing details, or if any details are misrepresented.

> Effective time in Entity updates.
> ---------------------------------
>                 Key: FALCON-1406
>                 URL: https://issues.apache.org/jira/browse/FALCON-1406
>             Project: Falcon
>          Issue Type: New Feature
>            Reporter: sandeep samudrala
>            Assignee: sandeep samudrala
>         Attachments: FALCON-1406-initial.patch, effective_time_in_entity_updates.pdf
> Effective time for entity updates needs to be supported for times in the past as well. An
effective time capability was provided earlier, but it only allows setting an effective time
for an entity to the current or a future time (now + delay), which could not
solve all the issues.
> Following are a few scenarios that require the effective time to be settable to a time
in the past.
> a) New code is deployed for an incompatible input data set, which would leave instances
with old code and new data.
> b) Bad code is pushed, after which the entity should be able to go back in time to replay (rerun)
with new code.
> c) Orchestration-level changes (good/bad) need the ability to go back in time
to start with.
> For reference: Linking all the Jiras that have been worked upon around effective time
> https://issues.apache.org/jira/browse/FALCON-374
> https://issues.apache.org/jira/browse/FALCON-297

This message was sent by Atlassian JIRA
