nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: Am I doing this right? with regarding to records
Date Tue, 01 May 2018 19:59:52 GMT
Unfortunately, the current JSON record readers do not expect one JSON
document per line, because a file in that form is technically not a
valid JSON document itself. Your file would have to be represented as
an array of documents, like [ doc1, doc2, doc3, ... ]
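
As a rough illustration of that array form, here is a minimal Python
sketch that runs outside NiFi and rewraps one-document-per-line text as
a single JSON array. The function name ndjson_to_array and the sample
data are made up for this example; this is not part of NiFi or of the
PR mentioned in this message.

```python
import json


def ndjson_to_array(text: str) -> str:
    """Turn one-JSON-document-per-line text into a single JSON array string.

    Hypothetical helper for illustration only; not a NiFi API.
    """
    # Parse each non-blank line as its own JSON document.
    docs = [json.loads(line) for line in text.splitlines() if line.strip()]
    # Serialize the collected documents as one array: [doc1, doc2, ...]
    return json.dumps(docs)


# Made-up sample input: two JSON objects, one per line.
sample = '{"key": "a", "v": 1}\n{"key": "b", "v": 2}\n'
array_text = ndjson_to_array(sample)
print(array_text)
```

A file pre-processed this way is a valid JSON document, which is the
shape described above.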

There is a PR up to support the per-line JSON document though:
https://github.com/apache/nifi/pull/2640

In both of your examples, if you are splitting before partitioning,
then what is the partitioning accomplishing?

If you had the changes in the PR above, the goal would be to avoid
SplitRecord entirely: you would just send GetFile -> PartitionRecord ->
whatever comes next.
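
To show why no split step is needed before partitioning, here is a
conceptual Python sketch of grouping a batch of records by a key in a
single pass. It only illustrates the idea and is not how PartitionRecord
is implemented; partition_by_key and the field name "key" are
assumptions made up for the example.

```python
import json
from collections import defaultdict


def partition_by_key(array_text: str, key: str) -> dict:
    """Group the records of a JSON array by the value of one field.

    Hypothetical helper for illustration only; not a NiFi API.
    """
    groups = defaultdict(list)
    # One pass over the whole batch; no per-record splitting beforehand.
    for record in json.loads(array_text):
        groups[record.get(key)].append(record)
    return dict(groups)


# Made-up sample batch with three records and two distinct key values.
batch = '[{"key": "a", "v": 1}, {"key": "b", "v": 2}, {"key": "a", "v": 3}]'
partitions = partition_by_key(batch, "key")
for value, records in partitions.items():
    print(value, len(records))
```

Each group comes out with its like records already together, which is
the kind of output a later step could work with without a split first.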


On Tue, May 1, 2018 at 3:34 PM, Juan Sequeiros <hellojuan@gmail.com> wrote:
> Hello all,
>
> I have one file on local disk with thousands of lines, each representing
> a valid JSON object.
> My flow is like this:
>
> GetFile > SplitText > PartitionRecord ( based on a key ) >  MergeRecord >
> PutElasticSearchRecord
>
> This works well, however, I seem to bottleneck at PartitionRecord
>
> So I looked at using
> GetFile > ConvertRecord > SplitRecord > PartitionRecord
>
> But it seems to only convert the first line of the content from my GetFile.
>
> Am I missing something?
>
> I have a bottleneck that could very well be a system resource issue, but
> still, what is the best way to take a file with lines of JSON and convert
> them into records? I assume it's through the record readers and writers, and
> then it's implied that it converts it to an "object" based on the AvroSchema
> ( in my case)?
>
>
