nifi-users mailing list archives

From Chirag Dewan <chirag.dewa...@yahoo.in>
Subject Re: RE: [EXT] Re: Polling Processors impact on Latency
Date Wed, 08 Nov 2017 09:55:02 GMT
That's a great start, Andy and Peter. Thank you for such precise answers.
I will start tweaking the parameters you mentioned and try to reach an optimum latency-resource
configuration.
Thanks a lot for your help. 
Chirag
On Tuesday 7 November 2017, 8:40:58 PM IST, Andy Christianson <aichrist@protonmail.com> wrote:
 
 Chirag,

Peter's note about bored yield duration is right on. Some additional things I'd like to point
out are:

1) You might get the lowest latency in a configuration where the processor runs continuously
(bored yield duration 0). This is because with the code executing continuously, CPU caches
should stay hot. The trade-off is wasted CPU cycles when the consumer is waiting for input
more often than processing it.

2) We do have an event driven scheduling mode. For NiFi this still appears to be experimental:

"Event driven: When this mode is selected, the Processor will be triggered to run by an event,
and that event occurs when FlowFiles enter Connections feeding this Processor. This mode is
currently considered experimental and is not supported by all Processors. When this mode is
selected, the ‘Run schedule’ option is not configurable, as the Processor is not triggered
to run periodically but as the result of an event. Additionally, this is the only mode for
which the ‘Concurrent tasks’ option can be set to 0. In this case, the number of threads
is limited only by the size of the Event-Driven Thread Pool that the administrator has configured."

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html

This mode is also implemented in MiNiFi - C++ and is done using condition variables. This
type of event-driven scheduling should put you near the lower limit of latency without sacrificing
much CPU, assuming a workload where the consumer is waiting for input more often than processing
it.
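
To illustrate the condition-variable idea, here is a minimal Python sketch. It is not NiFi or MiNiFi code: `EventDrivenQueue` and `run_demo` are hypothetical names standing in for a connection between two processors. The consumer sleeps until a producer signals that an item has arrived, so it neither spins nor waits out a fixed yield interval.

```python
import threading
from collections import deque


class EventDrivenQueue:
    """Hypothetical stand-in for a connection between two processors."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        # Producer side: enqueue, then wake one waiting consumer.
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def take(self, timeout=None):
        # Consumer side: block until an item is available or the timeout expires.
        with self._cond:
            # wait_for re-checks the predicate after every wakeup,
            # which guards against spurious wakeups.
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout=timeout):
                return None
            return self._items.popleft()


def run_demo():
    queue = EventDrivenQueue()
    received = []

    def consumer():
        for _ in range(3):
            received.append(queue.take(timeout=5.0))

    worker = threading.Thread(target=consumer)
    worker.start()
    for name in ("a", "b", "c"):
        queue.put(name)
    worker.join()
    return received
```

Calling `run_demo()` hands three items to the consumer in arrival order. While the queue is empty the consumer thread is parked inside `wait_for`, so it uses no CPU.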

I would suggest trying out different configurations and taking measurements, as the ideal
config will depend a lot on your workload.
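
Before measuring, a rough back-of-the-envelope model can set expectations. The sketch below assumes, purely for illustration, that an idle consumer re-checks its queue on a fixed grid of `yield_ms` milliseconds. The real scheduler is more involved, and the function names are hypothetical.

```python
import math


def polling_delay_ms(arrival_ms, yield_ms):
    """Time an item waits before the next poll picks it up.

    Assumes polls happen at 0, yield_ms, 2 * yield_ms, ... (a simplification).
    A yield of 0 models a continuously running consumer with no added wait.
    """
    if yield_ms <= 0:
        return 0.0
    next_poll = math.ceil(arrival_ms / yield_ms) * yield_ms
    return next_poll - arrival_ms


def mean_polling_delay_ms(arrivals_ms, yield_ms):
    """Average added wait over a list of arrival times."""
    delays = [polling_delay_ms(t, yield_ms) for t in arrivals_ms]
    return sum(delays) / len(delays)
```

Under this model, arrivals spread evenly over time wait about half the yield duration on average, so a 10 ms yield adds roughly 5 ms per idle hop. Measurements on a real flow take precedence over this estimate.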

Regards,

Andy I.C.

Sent from ProtonMail, Swiss-based encrypted email.


>-------- Original Message --------
>Subject: RE: [EXT] Re: Polling Processors impact on Latency
>Local Time: November 7, 2017 7:29 AM
>UTC Time: November 7, 2017 12:29 PM
>From: pwicks@micron.com
>To: users@nifi.apache.org <users@nifi.apache.org>
>
>
>If you schedule the processor to run every 0 sec (the default) then in my experience you
>won’t notice latency from polling at all. But I guess this depends on your expectations,
>volume, and overall Flow processing time.
>
>
>
>Yes, event driven may help, but from what I’ve read it’s more about reducing server
>resource consumption than latency (could be wrong).
>
>
>
>As for a hard set limit, there is a configuration entry in nifi.properties that seems
>relevant:
>
>
>
># If a component has no work to do (is "bored"), how long should we wait before checking again for work?
>
>nifi.bored.yield.duration=10 millis
>
>
>
>Thanks,
>
>  Peter
>
>
>
>From: Chirag Dewan [mailto:chirag.dewan22@yahoo.in] 
>Sent: Tuesday, November 07, 2017 8:02 PM
>To: aperepel@gmail.com; users@nifi.apache.org
>Subject: [EXT] Re: Polling Processors impact on Latency
>
>
>Thanks Andrew for the quick response.
>
>
>I am more concerned about the processors polling for flow files on the connections between
>the processors.
>
>Thanks,
>
>Chirag
>
>Sent from Yahoo Mail on Android
>
>
>>On Tue, 7 Nov 2017 at 5:24 PM, Andrew Grande
>><aperepel@gmail.com> wrote:
>>Yes, polling increases latency in some cases. But no, NiFi is not just polling. It
>>has all kinds of sources, and listening vs polling vs subscribing purely depends on the protocol
>>of that given processor.
>>
>>Hope this helps,
>> Andrew
>>
>>
>>
>>On Tue, Nov 7, 2017, 1:39 AM Chirag Dewan <chirag.dewan22@yahoo.in> wrote:
>>>Hi All,
>>>
>>>I am a layman to NiFi. I am exploring NiFi as a data flow engine to be integrated
>>>with my Flink processing engine. A brief history of our approach:
>>>
>>>We are trying to build a Streaming Data processing engine. We started off with
>>>Flink as the sole core engine, which is responsible for collection (through Flink Sources)
>>>as well as processing the data.
>>>
>>>Soon we stumbled onto NiFi and the data flow world.
>>>
>>>So far, my understanding is that the NiFi processors are polling processors and
>>>not Pub-Sub processors. That makes me wonder, what's the impact of polling on latency? I know
>>>I can configure my processors to trade off latency against throughput, but is there a hard set limit
>>>on the latency I can achieve using NiFi?
>>>
>>>As I said, I am a layman as yet. Perhaps my understanding falls short here. Any leads
>>>would be much appreciated.
>>>
>>>P.S. - Not diving much into Event Driven Processors. They look like something which
>>>might clear my thoughts. But since they are marked experimental, I would be more interested
>>>in understanding the timer driven processors.
>>>
>>>Thanks,
>>>
>>>Chirag
>>>
>>>  