nifi-users mailing list archives

From Louis-Étienne Dorval <ledor...@gmail.com>
Subject Re: Wait from multiple inputs before ending the flow
Date Sat, 12 Dec 2015 19:19:10 GMT
Juan,

Very good advice!
I've updated my loop to add a processor that runs every X seconds when a
retry is needed, so that a retry won't block the other requests:


InvokeHttp --(failure)--> UpdateAttribute (increments a counter) -->
RouteOnAttribute (retries if the counter is lower than X) --(if retriable)-->
UpdateAttribute (does nothing, but is scheduled to run every X seconds) -->
InvokeHttp (same as the first)
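
For readers who prefer code to arrows, here is a minimal Python sketch of the
counter-and-delay logic in the loop above. It is not NiFi code: the function
name invoke_with_retry and its parameters are made up for illustration, and
unlike the NiFi flow (where a waiting FlowFile simply sits in a queue) this
version handles a single request and blocks while it waits.

```python
import time
from typing import Callable, Tuple


def invoke_with_retry(
    call: Callable[[], bool],
    max_retries: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Try `call` until it succeeds or the failure counter reaches max_retries.

    `call` returns True on success. The result is (succeeded, attempts_made).
    All names here are hypothetical; they only mirror the steps of the flow.
    """
    retry_count = 0  # plays the role of the counter attribute on the FlowFile
    while True:
        if call():  # first or repeated InvokeHttp
            return True, retry_count + 1
        retry_count += 1  # UpdateAttribute: increment the counter
        if retry_count >= max_retries:  # routing step: counter no longer lower than X
            return False, retry_count
        sleep(delay_seconds)  # the timer-scheduled step that delays the retry


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # fail twice, then succeed
    print(invoke_with_retry(lambda: next(outcomes), max_retries=5, delay_seconds=0.1))
```

The sleep function is passed in as a parameter only so that the delay can be
replaced in a test; in the flow itself the delay comes from the processor's
run schedule.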


Thank you
Louis-Etienne

On 12 December 2015 at 14:07, Juan Sequeiros <hellojuan@gmail.com> wrote:

> Good afternoon,
>
> All processors have a scheduling tab where you can set them to run every
> X amount of time. I would set it on your last InvokeHttp.
>
> If time is important, look at the prioritizer settings too, so that you
> can, for example, send a more important FlowFile through first.
> On Dec 12, 2015 13:48, "Louis-Étienne Dorval" <ledor473@gmail.com> wrote:
>
>> Hi again,
>>
>>
>> The MergeContent works perfectly for my case! The flow I've described in
>> the previous email changed a bit, but still it's working as expected.
>>
>> The only problem is that NiFi is now much faster than the existing system
>> running in parallel (which is really good). I've built the "retry loop"
>> described below, but it's still too fast:
>>
>> InvokeHttp --(failure)--> UpdateAttribute (increments a counter) -->
>> RouteOnAttribute (retries if lower than X) --> InvokeHttp
>>
>>
>> Question: Is there something that already exists which could "sleep" a
>> FlowFile for X seconds before continuing?
>>
>>
>> Best regards and great job on the version 0.4.0, the Syslog feature is
>> much appreciated!
>> Louis-Etienne
>>
>> PS: Let me know if I should have started a new email thread with that
>> question.
>>
>>
>> On 6 December 2015 at 23:30, Joe Witt <joe.witt@gmail.com> wrote:
>>
>>> Louis-Etienne,
>>>
>>> NIFI-190 isn't scheduled on anything as of yet.  We had some design
>>> questions/ideas and your example informs it even further.
>>>
>>> I think the custom proc method you mention will work out well.
>>> Ultimately there will need to be one anyway to deal with the logic of
>>> merging this particular format+schema.
>>>
>>> Thanks
>>> Joe
>>>
>>> On Sun, Dec 6, 2015 at 11:28 PM,  <ledor473@gmail.com> wrote:
>>> > Joe,
>>> >
>>> > Thanks for the prompt reply.
>>> > About the merge, both messages will be JSON and I need some specific
>>> > parts from both.
>>> > I'll recheck the docs to see what my options are, but I think that
>>> > using FlowFile streams and a custom processor that does the logic
>>> > might be good.
>>> >
>>> > About the HoldProcessor, you must be talking about NIFI-190. The way you
>>> > describe it seems to be what I'm looking for.
>>> > But from the JIRA and a quick look at the PR, it seems like I would
>>> > lose the message from Topic2.
>>> >
>>> > I'll dig into the code of the PR and the MergeContent processor in order
>>> > to have a better understanding.
>>> >
>>> > Was that JIRA scheduled for a specific milestone? It would probably be
>>> > a great addition, but maybe it requires a lot of changes that I don't
>>> > see yet.
>>> >
>>> > Regards,
>>> > Louis-Etienne
>>> >
>>> >> On 6, 2015, at 9:42 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>> >>
>>> >> Louis-Etienne,
>>> >>
>>> >> My initial thought is your idea with MergeContent is the right one.
>>> >> However, the issue there is not just the combining of the data but the
>>> >> 'what does merging truly mean in that case'.  So it is a bit undefined
>>> >> what the next step will be.  Merge the content?  If so, how?  What is
>>> >> the format and schema of the objects before the merge and after?
>>> >>
>>> >> Another member of the community had an idea for a concept of a
>>> >> HoldProcessor.  It would allow these sorts of multi-object gates to
>>> >> occur.  The same issue exists of what to do once the gate criteria is
>>> >> hit but at that point you'd have more control over it.  MergeContent
>>> >> is an already prescribed set of behaviors whereas HoldContent would
>>> >> let you choose the next gate.  We really should get on with helping
>>> >> get that contribution in.
>>> >>
>>> >> Thanks
>>> >> Joe
>>> >>
>>> >>> On Sun, Dec 6, 2015 at 9:35 PM, Louis-Étienne Dorval <ledor473@gmail.com> wrote:
>>> >>> Hi everyone!
>>> >>>
>>> >>> I'm very excited to start using NiFi and I think that it will be very
>>> >>> useful for some projects.
>>> >>>
>>> >>> I've been playing with it for some time and have built a few basic
>>> >>> flows, but I'm having a hard time figuring out how to achieve one part
>>> >>> of my flow, or whether NiFi will be able to do it.
>>> >>> I'm building a flow around existing systems, so NiFi would run in
>>> >>> parallel with them and gather the output of these systems (everything
>>> >>> is asynchronous) to take actions.
>>> >>>
>>> >>> Everything starts with a GetJMSTopic on Topic1, followed by 2-3
>>> >>> processors that do attribute extraction.
>>> >>> During that time the existing system will process the same message,
>>> >>> enrich it (but also remove some useful information) and publish it on
>>> >>> Topic2.
>>> >>> I need the message from Topic2, so I've added another GetJMSTopic on
>>> >>> Topic2.
>>> >>> Then I need to somehow take the FlowFile from Topic1 and the one from
>>> >>> Topic2 and "merge" them together in order to have the attributes from
>>> >>> both FlowFiles.
>>> >>> After that I will probably need to use GetMongo to access some
>>> >>> information. This will probably create a new FlowFile that I need to
>>> >>> "merge" with the others.
>>> >>> Then I'll put that in HBase or something else, not sure yet.
>>> >>>
>>> >>> The part that I'm not sure how to solve is the "merge" of multiple
>>> >>> FlowFiles. I'm hesitating between the MergeContent processor and
>>> >>> DetectDuplicate:
>>> >>>
>>> >>> MergeContent seems like what I need, but the existing systems might
>>> >>> add some latency (and it will increase when a lot of messages are
>>> >>> published on Topic1), so I would need to increase the 'Maximum number
>>> >>> of Bins'.
>>> >>> It will probably affect the performance of the system, but how badly?
>>> >>> DetectDuplicate would feel awkward to use since it's not really a
>>> >>> duplicate, but it would be more lightweight (it only keeps a hash).
>>> >>> But will I be able to find the previous FlowFile with
>>> >>> "original.flowfile.description"?
>>> >>>
>>> >>>
>>> >>> Let me know if there's another option that I didn't look into.
>>> >>> Or maybe my problem is really trivial but I need to change my
>>> >>> perspective on it...
>>> >>>
>>> >>> Best regards,
>>> >>> Louis-Etienne
>>>
>>
>>
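
As an illustration of the "merge" discussed in the quoted messages (holding
the FlowFile from Topic1 until its counterpart from Topic2 arrives, then
combining the attributes of both), here is a small Python sketch. It assumes
both messages carry a shared identifier, which the thread does not state, and
the class AttributeJoiner and its method names are hypothetical. It is not how
MergeContent or the proposed HoldProcessor is implemented.

```python
from typing import Dict, Iterable, Optional


class AttributeJoiner:
    """Holds one attribute set per source until every source has reported.

    Hypothetical helper: records are paired on an assumed shared id.
    """

    def __init__(self, sources: Iterable[str] = ("topic1", "topic2")) -> None:
        self._sources = set(sources)
        # correlation id -> {source name -> attributes seen so far}
        self._pending: Dict[str, Dict[str, Dict[str, str]]] = {}

    def offer(
        self, correlation_id: str, source: str, attributes: Dict[str, str]
    ) -> Optional[Dict[str, str]]:
        """Store one record; return the merged attributes once the set is complete."""
        if source not in self._sources:
            raise ValueError(f"unknown source: {source}")
        waiting = self._pending.setdefault(correlation_id, {})
        waiting[source] = dict(attributes)
        if set(waiting) != self._sources:
            return None  # still waiting for the other side
        del self._pending[correlation_id]
        merged: Dict[str, str] = {}
        for name in sorted(waiting):  # fixed order: later names win on key clashes
            merged.update(waiting[name])
        return merged

    def pending_count(self) -> int:
        """Number of ids still waiting for a counterpart."""
        return len(self._pending)


if __name__ == "__main__":
    joiner = AttributeJoiner()
    print(joiner.offer("42", "topic1", {"order": "A", "status": "new"}))
    print(joiner.offer("42", "topic2", {"status": "enriched", "price": "10"}))
```

The pending map grows while the second system lags behind, which is the same
trade-off raised above about the number of bins: anything that waits for a
counterpart has to be kept somewhere until it shows up or is given up on.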
