nifi-dev mailing list archives

From Oleg Zhurakousky <ozhurakou...@hortonworks.com>
Subject Re: [DISCUSS] Run Once scheduling
Date Thu, 12 Jan 2017 18:16:48 GMT
I was just about to suggest the same.
Run-once would be a bit counterintuitive to flow processing as a concept. Basically, think
of it this way: a flow, or parts of it, has only two states, RUNNING or STOPPED. In the RUNNING
state it processes the data as it arrives (every second, every minute, every day, etc.). Indeed,
there may be a concern that the processor will do a lot of 'dry' spins if no data is available,
but fortunately NiFi allows you to limit the impact of that by configuring the "yield duration".
By default it is set to 1 sec, but for your case you may want to set it to 1 hour or so, essentially
controlling the scheduling of such a processor between 'dry' spins.
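
To make the 'dry' spin point concrete, here is a minimal sketch of a toy scheduling model. It is not NiFi code and uses no NiFi API; the function name `simulate` and its parameters are hypothetical and exist only for illustration. The assumption is that a processor finding an empty queue sleeps for the yield duration before it is triggered again, while a processor finding data handles one item immediately.

```python
def simulate(arrival_times, horizon_s, yield_s):
    """Toy model (an assumption, not NiFi's scheduler).

    Returns (dry_spins, processed) for the time window [0, horizon_s).
    arrival_times: seconds at which one input item lands on the queue.
    yield_s: how long the processor backs off after finding no data.
    """
    if yield_s <= 0:
        raise ValueError("yield_s must be positive")
    pending = sorted(arrival_times)
    now = 0.0
    next_item = 0      # index of the next item not yet processed
    dry_spins = 0
    processed = 0
    while now < horizon_s:
        if next_item < len(pending) and pending[next_item] <= now:
            # Data is waiting: process one item, no back-off.
            next_item += 1
            processed += 1
        else:
            # Nothing to do: count a dry spin and yield.
            dry_spins += 1
            now += yield_s
    return dry_spins, processed


if __name__ == "__main__":
    # One item arriving 30 minutes into a 2-hour window.
    print(simulate([1800.0], 7200.0, yield_s=1.0))
    print(simulate([1800.0], 7200.0, yield_s=3600.0))
```

In this model a longer yield duration cuts the number of dry spins sharply, at the cost of an item possibly waiting until the next wake-up before it is picked up.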

That said, and just to entertain the idea of Run Once, what do you think the processor
state should be once it has run once? Let's assume it did and somehow was stopped... then what? The
data arrives on the incoming queue, but nothing is processed until someone manually goes and
restarts the processor. Right?
I mean, from the general workflow standpoint the concern is very valid, but from the flow processing
standpoint the fact that NiFi does not support it is actually more of a feature than a lack of functionality.

Thoughts?

Cheers
Oleg

> On Jan 12, 2017, at 1:02 PM, Joe Witt <joe.witt@gmail.com> wrote:
> 
> Naz,
> 
> Why not just leave all the processes running?  If the data only
> arrives periodically that is ok, right?
> 
> Thanks
> Joe
> 
> On Thu, Jan 12, 2017 at 10:54 AM, Irizarry Jr., Nazario <naz@mitre.org> wrote:
>> On a project that I am on we have been looking at using NiFi for orchestrations that
>> are invoked infrequently.  For example, once a month a new data input product becomes available
>> and then one wants to run it through a set of processing steps that can be nicely implemented
>> using NiFi processors.  However, using the interval or cron scheduling for this purpose begins
>> to get cumbersome after a while with the need to start and manually stop these occasional
>> flows.
>> 
>> It would be fairly easy to add an additional scheduling option - “Run Once” for
>> this use case.  The behavior would be that when a processor is set to run once it automatically
>> stops after it has successfully processed one input.
>> 
>> What do people think?  We are willing to implement this small enhancement.
>> 
>> Cheers,
>> 
>> Naz Irizarry
>> MITRE Corp.
>> 617-893-0074
>> 
>> 
>> 
> 
