nifi-users mailing list archives

From Andy LoPresto <alopresto.apa...@gmail.com>
Subject Re: Combining outputs in parallel returns a random output
Date Thu, 02 Jun 2016 06:12:55 GMT
The "parallel" flow you have written isn't actually parallel; it's just "independent". Each
of the three processors performs its intended function and passes a flowfile containing only
*the JSON element it is responsible for* to the destination. Unfortunately, the two components
that might seem helpful -- a funnel or a MergeContent processor -- do not achieve what you are
looking for.
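
To make the "independent" behaviour concrete, here is a minimal Python sketch. It is not NiFi code: flowfiles are modelled as plain dictionaries, `run_check` is a hypothetical stand-in for one ExecuteStreamCommand processor, and the values are taken from the example output quoted below. It only illustrates why each independent branch ends up with a single check result while a linear chain accumulates all three.

```python
import copy

def run_check(flowfile, name, value):
    """Hypothetical stand-in for one processor: copy the flowfile, add one attribute."""
    out = copy.deepcopy(flowfile)
    out["attributes"][name] = value
    return out

CHECKS = [
    ("File_Delimiter_Validation", "50"),
    ("File_Header_Validation", "0"),
    ("File_Trailer_Validation", "100"),
]

incoming = {"attributes": {"fileName": "input.txt"}}

# Independent branches: every branch starts from the same incoming flowfile,
# so each resulting flowfile carries only its own check result.
independent = [run_check(incoming, name, value) for name, value in CHECKS]

# Linear chain: each step receives the previous step's output,
# so the attributes accumulate on a single flowfile.
chained = incoming
for name, value in CHECKS:
    chained = run_check(chained, name, value)

for flowfile in independent:
    print(sorted(flowfile["attributes"]))
print(sorted(chained["attributes"]))
```

In this model the destination receives three separate flowfiles in the independent case, each missing two of the three values, which matches the null entries in the reported output.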

If parallel performance of the three ExecuteStreamCommand processors is really that important,
I would recommend using the ExecuteScript processor with a Groovy script that uses an individual
thread for each action (either via standard thread management or something like GPars for
easy parallel execution). 
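
The one-thread-per-action idea can be sketched as follows. This is a standalone Python illustration rather than a Groovy ExecuteScript body, and it does not use any NiFi API. The three `check_*` functions are hypothetical placeholders for the external commands, and their return values are copied from the "correct output" example quoted below. The point is only that all three results are collected in one place before a single combined JSON is produced.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholders for the three external commands.
def check_delimiter(file_name):
    return "50"

def check_header(file_name):
    return "0"

def check_trailer(file_name):
    return "100"

CHECKS = {
    "File_Delimiter_Validation": check_delimiter,
    "File_Header_Validation": check_header,
    "File_Trailer_Validation": check_trailer,
}

def run_checks_in_parallel(file_name):
    """Run every check on its own worker thread and combine the results."""
    with ThreadPoolExecutor(max_workers=len(CHECKS)) as pool:
        futures = {name: pool.submit(fn, file_name) for name, fn in CHECKS.items()}
        # result() blocks until the corresponding check has finished.
        results = {name: future.result() for name, future in futures.items()}
    return {
        "checks": [{"Check_Type": name, "value": value} for name, value in results.items()],
        "fileName": file_name,
    }

combined = run_checks_in_parallel("input.txt")
print(json.dumps(combined))
```

Because the results are gathered inside one unit of work, only one output is produced, so nothing has to be merged afterwards.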

In this case, barring an absolute requirement for "parallel" activity, I would recommend connecting
the three ExecuteStreamCommand (ESC) processors linearly, so that each one adds its attribute to the
same flowfile and the destination processor receives the cumulative result.

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jun 1, 2016, at 22:25, Kavinaesh Dheenadayalan <kavinaesh@gmail.com> wrote:
> 
> Hi Team,
> 
> I am new to NiFi and am working on a demo for my colleagues.
> 
> I am creating a JSON document using this flow. The outputs of the ExecuteStreamCommand
> processors, which are JSON attributes, are combined to create the JSON.
> When these three ExecuteStreamCommand processors are connected sequentially to an
> AttributesToJSON processor, the flow works fine, with all three attributes populated in
> the final JSON.
> But when I connect these three ExecuteStreamCommand processors in parallel, only one
> random attribute gets populated in the final JSON.
> Please let me know if there is a way to connect these processors in parallel and get
> all the attributes in the final JSON. Scheduling, merging, or any other option?
> 
> I have attached the templates, screen shots of the processors for your reference.
> 
> Wrong output when the flow is parallel:
> 
> { "checks": [ { "Check_Type": "File_Delimiter_Validation", "value": null },
>   { "Check_Type": "File_Header_Validation", "value": "0" },
>   { "Check_Type": "File_Trailer_Validation", "value": null } ],
>   "fileName": "input.txt" }
> 
> 
> Correct output when the flow is sequential:
> 
> { "checks": [ { "Check_Type": "File_Delimiter_Validation", "value": "50" },
>   { "Check_Type": "File_Header_Validation", "value": "0" },
>   { "Check_Type": "File_Trailer_Validation", "value": "100" } ],
>   "fileName": "input.txt" }
> 
> <parallel not working.jpg>
> <sequential working.jpg>
> <attributesJson.JPG>
> <ExecuteStreamCommand.JPG>
> <sequential_flow.xml>
> <parallel_flow.xml>
