nifi-users mailing list archives

From <Josef.Zahn...@swisscom.com>
Subject Re: A question about [MergeContent] processor
Date Fri, 04 Jan 2019 10:43:34 GMT
Hi Jianan

Since you have “Minimum Number of Entries: 1”, it is normal to see merges containing only
one flowfile. In my opinion, “Minimum Number of Entries” takes precedence over “Max
Bin Age” (the first is written in bold, the second is not). Also, the property is called
“Max Bin Age”, not “Bin Age”, so a bin can be pushed out as soon as it holds at least one flowfile.
However, in my opinion the documentation for “Max Bin Age” is too unspecific (when exactly
does it take effect?); only the developers know precisely how it works. It would be
great to get more information here…
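
To make the question concrete, here is a minimal Python sketch of one possible reading of these properties. It is an assumption for illustration, not NiFi's actual implementation: a bin is treated as ready once all configured minimums are met, or once it is older than Max Bin Age, whichever comes first. The names `Bin` and `is_ready` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """Hypothetical stand-in for a MergeContent bin (not a NiFi class)."""
    entries: int        # flowfiles currently in the bin
    size_bytes: int     # total content size of those flowfiles
    age_seconds: float  # time since the first flowfile was added


def is_ready(bin_: Bin, min_entries: int, min_bytes: int, max_age_seconds: float) -> bool:
    """Assumed rule: merge when all minimums are met OR the bin is too old."""
    minimums_met = bin_.entries >= min_entries and bin_.size_bytes >= min_bytes
    too_old = bin_.age_seconds >= max_age_seconds
    return minimums_met or too_old


if __name__ == "__main__":
    # Minimum of 1 entry and no size minimum: one flowfile is already enough.
    print(is_ready(Bin(1, 10, 2.0), min_entries=1, min_bytes=0, max_age_seconds=300))
    # Minimums not met, but the bin is older than 5 minutes: flushed anyway.
    print(is_ready(Bin(3, 30, 301.0), min_entries=100, min_bytes=0, max_age_seconds=300))
```

Under this reading, Max Bin Age is only an upper bound on how long a bin may wait, which would fit the “Max” in its name. Whether the real processor behaves this way in every case is exactly the open question.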

Just my 2 cents: whenever possible, try to use “Merge Strategy: Defragment” instead of
the current one. This only works if you can predict how many flowfiles you would
like to merge. With this strategy, Max Bin Age makes complete sense and works as expected.
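
As a rough illustration of why Defragment needs a predictable count, here is a simplified Python model (not NiFi code). It assumes each flowfile carries the `fragment.identifier`, `fragment.index`, and `fragment.count` attributes that the Defragment strategy reads; the function name `defragment` and the (attributes, content) tuple layout are hypothetical.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Yield (group, merged_content) once all fragments of a group have arrived.

    Each flowfile is modelled as an (attributes, content) pair.
    """
    pending = defaultdict(dict)  # fragment.identifier -> {index: content}
    for attrs, content in flowfiles:
        group = attrs["fragment.identifier"]
        pending[group][int(attrs["fragment.index"])] = content
        # A group is complete only when the announced number of fragments is present.
        if len(pending[group]) == int(attrs["fragment.count"]):
            parts = pending.pop(group)
            yield group, b"".join(parts[i] for i in sorted(parts))


if __name__ == "__main__":
    incoming = [
        ({"fragment.identifier": "a", "fragment.index": "1", "fragment.count": "2"}, b"world"),
        ({"fragment.identifier": "b", "fragment.index": "0", "fragment.count": "1"}, b"solo"),
        ({"fragment.identifier": "a", "fragment.index": "0", "fragment.count": "2"}, b"hello "),
    ]
    print(list(defragment(incoming)))
```

In this model a group is emitted only when its announced fragment count is reached, so the merge size is determined by the data itself rather than by timing.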

Cheers Josef


From: Jianan Zhang <william.jn.zhang@gmail.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Friday, 4 January 2019 at 11:16
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: A question about [MergeContent] processor

Hi all,
I have a job consisting of the following steps: first consume data from Kafka, then pack
the data into one file every 5 minutes, and finally put the packed file into HDFS.
I use the [MergeContent] processor to accomplish the “packing” step. The MergeContent
properties I configured are listed below:

----------------------
Merge Strategy: Bin-Packing Algorithm
Merge Format: Binary Concatenation
Attribute Strategy: Keep Only Common Attributes
Correlation Attribute Name: No value set
Metadata Strategy: Do Not Merge Uncommon Metadata
Minimum Number of Entries: 1
Maximum Number of Entries: 999999999
Minimum Group Size: 255 MB
Maximum Group Size: No value set
Max Bin Age: 5 minutes
Maximum number of Bins: 1
----------------------

I find the behavior of the MergeContent processor very unpredictable. There are several
workflows running on NiFi with the same MergeContent configuration; some of them
correctly pack the data into one file every 5 minutes, but others do not.
It even happened that some MergeContent processors generated one flowfile per record.

I am wondering if I have misunderstood the mechanism of the MergeContent processor.

I am a NiFi newbie, please help me.

Thanks!