nifi-users mailing list archives

From Clay Teahouse <clayteaho...@gmail.com>
Subject Re: Optimizing Performance of Apache NiFi's Network Listening Processors
Date Mon, 05 Aug 2019 15:24:49 GMT
Hi Edward, Bryan,
One more question regarding ListenSyslog. Is it possible to set a batch
size greater than 1 with parse set to true? I am ingesting a very high
volume of syslog records and want to avoid flowfiles that contain only one
record, but at the same time I want to be able to parse the records. Is
there a way around this?

thanks
Clay

On Fri, Aug 2, 2019 at 8:50 AM Edward Armes <edward.armes@gmail.com> wrote:

> Hi Clay,
>
> As Bryan has said, the actual connections are managed by a selector. All
> the selector does is go through each connection and, once a connection has
> data to receive, hand it over to a thread in the TCP receiving thread
> pool. That thread then does some basic TCP processing and puts the data
> into a buffer for the associated ListenSyslog processor instance to
> process when the framework executes that instance.
>
> Just so you're aware, setting the maximum number of connections to 4,000
> does create a thread pool of 4,000 threads. In reality, though, these
> threads don't really exist until one is created by the selector to run on
> the pool. So, in short, unless a single NiFi server gets 4,000 syslog
> messages in a very short space of time (< 1 microsecond), I can't see it
> being an issue.
>
> Edward
>
> On Fri, Aug 2, 2019 at 2:06 PM Bryan Bende <bbende@gmail.com> wrote:
>
>> The actual connections themselves are managed with a selector, so if
>> all the connections are idle there should only be one thread for the
>> socket.
>>
>> As soon as a connection has something available to read, a thread is
>> spawned to read from the connection until either no more data is
>> available or the connection is closed.
>>
>> On Fri, Aug 2, 2019 at 7:18 AM Clay Teahouse <clayteahouse@gmail.com>
>> wrote:
>> >
>> > Hello Edward,
>> > So, if I have to listen to 32,000 TCP connections and I have only 80
>> > cores, and I configure each ListenSyslog instance for 4,000
>> > connections, doesn't each spawn 4,000 threads behind the scenes? The
>> > TCP connections will be idle most of the time.
>> >
>> > thanks
>> > Clay
>> >
>> >
>> > On Fri, Aug 2, 2019 at 6:10 AM Edward Armes <edward.armes@gmail.com> wrote:
>> >>
>> >> Hi Clay,
>> >>
>> >> Because NiFi uses a thread pool for its own threading underneath, and
>> >> each processor instance that runs does so in its own thread, I don't
>> >> see any reason why not. One thing to note is that the ListenTCP
>> >> processor appears to have been written such that it gets all the
>> >> requests that have been received on that socket and processes them
>> >> until either it has no more requests left to process or that instance
>> >> of the processor is no longer scheduled to run.
>> >>
>> >> Hope that helps
>> >>
>> >> Edward
>> >>
>> >> On Fri, Aug 2, 2019 at 11:28 AM Clay Teahouse <clayteahouse@gmail.com> wrote:
>> >>>
>> >>> Hello All,
>> >>>
>> >>> I need to listen to and process thousands of persistent TCP
>> >>> connections. I have 10 nodes, each having 8 cores.
>> >>> My understanding is that with the existing NiFi listening
>> >>> processors, such as ListenSyslog, a thread is utilized for each TCP
>> >>> connection. Does this scale? Do I need to write a custom processor
>> >>> that utilizes a thread pool for reading the data from the socket and
>> >>> processing it?
>> >>>
>> >>> thanks
>> >>> Clay
>>
>
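
The arrangement described in the thread above (one selector thread watching every connection, with reader threads created only when a connection actually has data) can be sketched with plain Java NIO. This is a minimal illustration under stated assumptions, not NiFi's actual implementation: the class name SelectorListenerSketch and its methods are hypothetical, the listener binds to an ephemeral loopback port, and received bytes are simply decoded as UTF-8 and queued rather than parsed as syslog.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch (not NiFi code): one selector thread, reader threads created on demand. */
public class SelectorListenerSketch implements AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;
    private final ThreadPoolExecutor readers;
    private final BlockingQueue<String> received = new LinkedBlockingQueue<>();

    public SelectorListenerSketch(int maxConnections) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0 = any free port (assumption)
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        // Core size 0: the pool may grow to maxConnections threads, but none exist until work arrives.
        readers = new ThreadPoolExecutor(0, maxConnections, 10, TimeUnit.SECONDS, new SynchronousQueue<>());
        Thread selectorThread = new Thread(this::selectLoop, "selector");
        selectorThread.setDaemon(true);
        selectorThread.start();
    }

    /** Runs on the single selector thread: accepts connections and dispatches readable ones. */
    private void selectLoop() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks while every connection is idle
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel ch = server.accept();
                        if (ch != null) {
                            ch.configureBlocking(false);
                            ch.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        key.interestOps(0); // pause selection for this key while a reader owns it
                        readers.execute(() -> drain(key));
                    }
                }
            }
        } catch (IOException | RuntimeException e) {
            // Selector closed or pool shut down: the loop simply ends in this sketch.
        }
    }

    /** Runs on a pool thread: reads until no more data is available or the peer closes. */
    private void drain(SelectionKey key) {
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try {
            int n;
            while ((n = ch.read(buf)) > 0) {
                buf.flip();
                received.add(StandardCharsets.UTF_8.decode(buf).toString());
                buf.clear();
            }
            if (n < 0) { // peer closed the connection
                ch.close();
                return;
            }
            key.interestOps(SelectionKey.OP_READ); // hand the connection back to the selector
            selector.wakeup();
        } catch (IOException | RuntimeException e) {
            try { ch.close(); } catch (IOException ignored) { }
        }
    }

    public int port() { return server.socket().getLocalPort(); }

    /** Number of reader threads that currently exist in the pool. */
    public int readerThreads() { return readers.getPoolSize(); }

    /** Waits up to timeoutMs for the next chunk of received text, or returns null. */
    public String poll(long timeoutMs) throws InterruptedException {
        return received.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() throws IOException {
        // Accepted channels are not tracked or closed individually in this sketch.
        selector.close();
        server.close();
        readers.shutdownNow();
    }
}
```

With this sketch, opening many connections to port() and leaving them idle keeps readerThreads() at zero; a pool thread appears only once one of the connections sends data, which is the behaviour the replies above describe for idle connections.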
