nifi-users mailing list archives

From Mika Borner <n...@my2ndhead.com>
Subject Re: Merging Records
Date Mon, 12 Jun 2017 19:36:36 GMT
Hi Mark

Yes, this makes sense.

In my case, I'm receiving single log events from a TCP input, which I
would like to process further with record processors. This is probably
an edge case where a record merger would make the post-processing
more efficient.

Good to hear it's already on the radar :-)

Mika
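
The batching idea described above can be sketched outside NiFi. This is a minimal illustration in plain Python, not NiFi code and not how a MergeRecord processor would be implemented; the function name `batch_records` and the default batch size of 200 are assumptions chosen only for the example. It groups single JSON events into newline-delimited batches, so downstream steps handle a few large payloads instead of many tiny ones.

```python
import json
from typing import Iterable, Iterator, List


def batch_records(events: Iterable[str], batch_size: int = 200) -> Iterator[str]:
    """Group single JSON events into newline-delimited batches.

    Hypothetical helper for illustration only; batch_size is an assumed value.
    """
    batch: List[str] = []
    for event in events:
        json.loads(event)  # raises ValueError if the event is not valid JSON
        batch.append(event.strip())
        if len(batch) >= batch_size:
            yield "\n".join(batch)
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield "\n".join(batch)


if __name__ == "__main__":
    sample = [json.dumps({"id": i, "msg": "log line"}) for i in range(450)]
    for merged in batch_records(sample, batch_size=200):
        print(len(merged.split("\n")), "records in this batch")
```

A real flow would also need a time-based flush so that a slow trickle of events does not sit in a partial batch indefinitely.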



On 06/12/2017 09:23 PM, Mark Payne wrote:
> Hi Mika,
>
> You're correct that there is not yet a MergeRecord processor. It is on my personal radar,
> but I've not yet gotten to it. One of the main reasons that I've not prioritized this yet
> is that typically in this record-oriented paradigm, you'll see data coming in, in groups
> and being processed in groups. MergeContent largely has been useful in cases where we
> split data apart (using processors like SplitText, for example), and then merge it back
> together later. I don't see this as being quite as prominent when using record readers
> and writers, as the readers are designed to handle streams of data instead of individual
> records as FlowFiles.
>
> That being said, there are certainly cases where MergeRecord still makes sense. For example,
> when you're ingesting small payloads or want to batch up to send to something like HDFS,
> which prefers larger files, etc. So I'll hopefully have a chance to start working on that
> this week or next.
>
> In the meantime, the best path forward for you may be to use MergeContent to concatenate
> a bunch of data before the processor that is using the Grok Reader. Or, if you are
> splitting the data up into individual records yourself, I would recommend not splitting
> them up at all.
>
> Does this make sense?
>
> Thanks
> -Mark
>
>
>> On Jun 12, 2017, at 3:12 PM, Mika Borner <nifi@my2ndhead.com> wrote:
>>
>> Hi,
>>
>> What is the best way to merge records? I'm using a GrokReader that spits out single
>> JSON records. For efficiency, I would like to merge a few hundred records into one
>> FlowFile. It seems there's no MergeRecord processor yet...
>>
>> Thanks!
>>
>> Mika
>>

