nifi-users mailing list archives

From Elli Schwarz <eliezer_schw...@yahoo.com>
Subject Re: MergeContent Processor
Date Mon, 23 Nov 2015 17:09:32 GMT
Thank you for your help. I'm still a bit confused: if a bin's min entries has been reached,
but not the max age, wouldn't it immediately merge the flowfiles? You mention 5 successive
unsuccessful attempts - but what would cause unsuccessful attempts after the min entries has
been reached? I don't understand how the max entries property would ever come into play.


For example, say I have min entries of 25, max entries of 100, and a max bin age of 5 minutes.
What scenario would cause 100 files to be included in a single merge if 100 flowfiles hit
the processor within 1 minute? (If it matters, I'm using bin-packing algorithms and
bin concatenation.)

I want to have a better understanding of the processor to adjust these settings to obtain
the optimal configuration. What I would like the processor to do is, even after min entries
has been reached, wait a certain period of time to see if there are more files. This is not
max bin size - I view that as a failsafe mechanism so that something doesn't stay in the queue
forever if the minimums are not reached.

Thanks!


On Thursday, November 19, 2015 8:44 AM, Aldrin Piri <aldrinpiri@gmail.com> wrote:

Elli,
Your understanding of the functionality is correct. There are a couple of criteria that drive
when a bin is "done." In this case, if you set the optional maximum properties, they can cause
a bin to be closed out sooner. That is, if a max age is specified and any of the bins have
gone beyond that time, they will be closed and transferred out.
Alternatively, a bin is also considered ready if the max age has not yet elapsed and either:
* both the minimum size and the minimum number of files have been reached, and a few successive
attempts to add to the bin (specifically, five) have been unsuccessful, signaling that it is
nearly full or the objects are ill suited for tighter packing; or
* its size or number of entries is greater than or equal to the respective, optionally specified,
maximum.
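
As a rough illustration of the criteria above, here is a minimal Python sketch. It only models
the rules as described in this message and is not NiFi's actual implementation: the `Bin` and
`Limits` classes, their field names, and the `is_ready` function are hypothetical names made up
for the example.

```python
from dataclasses import dataclass
from typing import Optional

# "specifically, five" successive unsuccessful attempts, per the message above
FAILED_ATTEMPTS_THRESHOLD = 5


@dataclass
class Bin:
    """Hypothetical model of a bin (not a NiFi class)."""
    entries: int = 0
    size_bytes: int = 0
    age_seconds: float = 0.0
    failed_add_attempts: int = 0  # successive unsuccessful attempts to add


@dataclass
class Limits:
    """Hypothetical stand-in for the processor's min/max properties."""
    min_entries: int = 1
    min_size_bytes: int = 0
    max_entries: Optional[int] = None        # optional maximum
    max_size_bytes: Optional[int] = None     # optional maximum
    max_age_seconds: Optional[float] = None  # optional maximum


def is_ready(b: Bin, lim: Limits) -> bool:
    # A bin that has gone beyond the max age is closed regardless of the minimums.
    if lim.max_age_seconds is not None and b.age_seconds > lim.max_age_seconds:
        return True
    # Reaching either optional maximum also makes the bin ready.
    if lim.max_entries is not None and b.entries >= lim.max_entries:
        return True
    if lim.max_size_bytes is not None and b.size_bytes >= lim.max_size_bytes:
        return True
    # Otherwise both minimums must be met AND five successive adds must have failed.
    mins_met = b.entries >= lim.min_entries and b.size_bytes >= lim.min_size_bytes
    return mins_met and b.failed_add_attempts >= FAILED_ATTEMPTS_THRESHOLD


# Settings from the question: min entries 25, max entries 100, max age 5 minutes.
example = Limits(min_entries=25, max_entries=100, max_age_seconds=300)
```

Under this model, a one-minute-old bin with 30 entries is not ready on its own. It becomes
ready when it reaches 100 entries, goes beyond the 5-minute age, or sees five successive
unsuccessful attempts to add to it.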

Let us know if you have any other questions!
On Thu, Nov 19, 2015 at 8:09 AM, Elli Schwarz <eliezer_schwarz@yahoo.com> wrote:

Hello,
I'm a bit confused about the relationship between certain properties of the MergeContent processor.
Specifically, how do the properties min entries, max entries, max bin age, and max number of
bins interact? If the MergeContent processor receives the min number of entries, does it merge
without waiting for max bin age? Or does max bin age trump the other properties? If max bin
age is hit before the min number of entries, does the processor wait until it gets the min
number? Does it merge once it gets to the max bin age, regardless of whether or not the max
entries have been received? What about min/max group size vs. min/max number of entries?

I want to make sure that the processor isn't waiting forever (i.e., it will send after 10 minutes
no matter what) if there's only 1 flowfile in the queue. If I set max bin age to 10 minutes
and min entries to 10, what does that mean? It seems to work the way I expect, which makes
me wonder what the min entries property means if it doesn't seem to be used.
Thank you for any clarifications possible. I looked through the documentation for this processor,
but it doesn't seem to explain these crucial details, which greatly impact my strategy for
using this processor properly.
-Elli





 
  