nifi-dev mailing list archives

From Matt Burgess <mattyb...@apache.org>
Subject Re: "Flatten" JSON
Date Sat, 16 Sep 2017 20:55:28 GMT
+1 for FlattenRecord as well. In the meantime you can use
ExecuteScript or InvokeScriptedProcessor, I have a Groovy script
(albeit for a different product) that does the flatten [1].

Regards,
Matt

[1] http://funpdi.blogspot.com/2014/10/flatten-json-to-key-value-pairs-in-pdi.html
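For illustration only (this is not the Groovy script linked at [1], and not an
existing NiFi processor): a minimal Python sketch of the schema-free flatten
requested in the thread. The function name `flatten`, the "." separator, and
the choice to index list elements by position are all assumptions made for
this example. Wiring it into ExecuteScript is left out.

```python
import json


def flatten(value, prefix="", sep="."):
    """Collapse nested dicts/lists into one dict keyed by joined paths.

    Assumptions for this sketch: list elements are keyed by their index,
    and empty dicts or lists contribute no keys to the result.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(child, path, sep))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            path = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, path, sep))
    else:
        # Leaf value (string, number, boolean, or null): store under its path.
        flat[prefix] = value
    return flat


if __name__ == "__main__":
    # Trimmed-down version of the weather document quoted below.
    doc = json.loads(
        '{"query": {"count": 1, "results": {"channel": {"item": '
        '{"condition": {"code": "33", "temp": "63"}}}}}}'
    )
    print(json.dumps(flatten(doc), indent=1))
```

No schema is needed because the recursion discovers the structure as it walks
the document. Keys that already contain the separator would be ambiguous after
flattening, so a real processor would likely make the separator configurable.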

On Fri, Sep 15, 2017 at 9:33 AM, Kevin Doran <kdoran.apache@gmail.com> wrote:
> +1 for adding a FlattenRecord processor. I can think of a few scenarios in which it would
> be quite useful, and it would be convenient if it could be accomplished without JOLT.
>
> Thanks,
> Kevin
>
> On 9/15/17, 09:16, "Nicholas Hughes" <nicholasmhughes@gmail.com on behalf of nicholasmhughes.nifi@gmail.com> wrote:
>
>     Mark,
>
>     I'm definitely for making the processor as generic as possible. I don't
>     mind chaining together a few simple processors to get a job done (such as
>     convert JSON to Avro > infer schema > flatten records)... I just don't want
>     steps to get super complex... and the Jolt Transform processor does seem very
>     powerful and very complex.
>
>     If there's some support for a "FlattenRecord" processor, I can submit the
>     Jira containing the meat of this thread.
>
>     -Nick
>
>
>     On Fri, Sep 15, 2017 at 9:01 AM, Mark Payne <markap14@hotmail.com> wrote:
>
>     > Nick,
>     >
>     > I do believe that there's a way to do what you're asking with Jolt,
>     > without knowing any kind of schema.
>     > That said, Jolt can get complex pretty quickly and I don't know it well
>     > :)  Personally, I have no problem with having a
>     > FlattenRecord processor. I guess the question here, though, is are you
>     > using Record-oriented processors,
>     > or are you using JSON-specific processors?
>     >
>     > Personally, I'd like to see a FlattenRecord processor, rather than
>     > FlattenJSON, because that would allow
>     > the transformation to apply to Avro as well (and as soon as we get an XML
>     > reader built, XML also). However,
>     > the Record-oriented processors would expect that a schema be given (though
>     > it could also be inferred using
>     > another existing processor).
>     >
>     > -Mark
>     >
>     >
>     >
>     > > On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <nicholasmhughes.nifi@gmail.com> wrote:
>     > >
>     > > Is there an easy way to "flatten" arbitrary JSON within NiFi?
>     > >
>     > > For input data like that shown below from Yahoo [1]
>     > >
>     > > {
>     > >  "query": {
>     > >    "count": 1,
>     > >    "created": "2017-09-15T11:20:26Z",
>     > >    "lang": "en-US",
>     > >    "results": {
>     > >      "channel": {
>     > >        "item": {
>     > >          "condition": {
>     > >            "code": "33",
>     > >            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
>     > >            "temp": "63",
>     > >            "text": "Mostly Clear"
>     > >          }
>     > >        }
>     > >      }
>     > >    }
>     > >  }
>     > > }
>     > >
>     > >
>     > > ...I'd like to end up with output something like this:
>     > >
>     > > {
>     > >  "query.count": 1,
>     > >  "query.created": "2017-09-15T11:20:26Z",
>     > >  "query.lang": "en-US",
>     > >  "query.results.channel.item.condition.code": "33",
>     > >  "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
>     > >  "query.results.channel.item.condition.temp": "63",
>     > >  "query.results.channel.item.condition.text": "Mostly Clear"
>     > > }
>     > >
>     > >
>     > > I checked out the JoltTransformJSON processor and some examples, such as
>     > > the nested data to "prefix soup" demo [2], but it seems as though I need to
>     > > enter information about the schema for the incoming data in order to
>     > > transform it. Ideally, I'd like to have a processor "just figure it out"
>     > > without explicit entry of a schema.
>     > >
>     > > Is there any way to accomplish this in a generic way with JoltTransformJSON
>     > > (or another native processor)?
>     > >
>     > > If not, would a ticket requesting a "Field Flattener" processor much like
>     > > the one included in StreamSets Data Collector [3] be worthwhile?
>     > >
>     > > Thanks in advance!
>     > >
>     > > -Nick
>     > >
>     > >
>     > > [1]
>     > > https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
>     > >
>     > > [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
>     > >
>     > > [3]
>     > > https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
>     >
>     >
>
>
>
