nifi-dev mailing list archives

From Nicholas Hughes <nicholasmhughes.n...@gmail.com>
Subject Re: "Flatten" JSON
Date Tue, 19 Sep 2017 16:36:23 GMT
Created an issue for this functionality [1]. Please change issue properties
and comment as necessary.

-Nick

[1] https://issues.apache.org/jira/browse/NIFI-4398


On Sat, Sep 16, 2017 at 4:55 PM, Matt Burgess <mattyb149@apache.org> wrote:

> +1 for FlattenRecord as well. In the meantime you can use
> ExecuteScript or InvokeScriptedProcessor, I have a Groovy script
> (albeit for a different product) that does the flatten [1].
>
> Regards,
> Matt
>
> [1] http://funpdi.blogspot.com/2014/10/flatten-json-to-key-value-pairs-in-pdi.html
>
> On Fri, Sep 15, 2017 at 9:33 AM, Kevin Doran <kdoran.apache@gmail.com> wrote:
> > +1 for adding a FlattenRecord processor. I can think of a few scenarios
> > in which it would be quite useful, and it would be convenient if it could
> > be accomplished without JOLT.
> >
> > Thanks,
> > Kevin
> >
> > On 9/15/17, 09:16, "Nicholas Hughes" <nicholasmhughes@gmail.com on behalf of nicholasmhughes.nifi@gmail.com> wrote:
> >
> >     Mark,
> >
> >     I'm definitely for making the processor as generic as possible. I don't
> >     mind chaining together a few simple processors to get a job done (such as
> >     convert JSON to Avro > infer schema > flatten records)... I just don't want
> >     steps to get super complex... and the Jolt Transform processor does seem very
> >     powerful and very complex.
> >
> >     If there's some support for a "FlattenRecord" processor, I can submit the
> >     Jira containing the meat of this thread.
> >
> >     -Nick
> >
> >
> >     On Fri, Sep 15, 2017 at 9:01 AM, Mark Payne <markap14@hotmail.com> wrote:
> >
> >     > Nick,
> >     >
> >     > I do believe that there's a way to do what you're asking with Jolt,
> >     > without knowing any kind of schema.
> >     > That said, Jolt can get complex pretty quickly and I don't know it well
> >     > :)  Personally, I have no problem with having a
> >     > FlattenRecord processor. I guess the question here, though, is are you
> >     > using Record-oriented processors,
> >     > or are you using JSON-specific processors?
> >     >
> >     > Personally, I'd like to see a FlattenRecord processor, rather than
> >     > FlattenJSON, because that would allow
> >     > the transformation to apply to Avro as well (and as soon as we get an XML
> >     > reader built, XML also). However,
> >     > the Record-oriented processors would expect that a schema be given (though
> >     > it could also be inferred using
> >     > another existing processor).
> >     >
> >     > -Mark
> >     >
> >     >
> >     >
> >     > > On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <nicholasmhughes.nifi@gmail.com> wrote:
> >     > >
> >     > > Is there an easy way to "flatten" arbitrary JSON within NiFi?
> >     > >
> >     > > For input data like that shown below from Yahoo [1]
> >     > >
> >     > > {
> >     > >  "query": {
> >     > >    "count": 1,
> >     > >    "created": "2017-09-15T11:20:26Z",
> >     > >    "lang": "en-US",
> >     > >    "results": {
> >     > >      "channel": {
> >     > >        "item": {
> >     > >          "condition": {
> >     > >            "code": "33",
> >     > >            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
> >     > >            "temp": "63",
> >     > >            "text": "Mostly Clear"
> >     > >          }
> >     > >        }
> >     > >      }
> >     > >    }
> >     > >  }
> >     > > }
> >     > >
> >     > >
> >     > > ...I'd like to end up with output something like this:
> >     > >
> >     > > {
> >     > >  "query.count": 1,
> >     > >  "query.created": "2017-09-15T11:20:26Z",
> >     > >  "query.lang": "en-US",
> >     > >  "query.results.channel.item.condition.code": "33",
> >     > >  "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
> >     > >  "query.results.channel.item.condition.temp": "63",
> >     > >  "query.results.channel.item.condition.text": "Mostly Clear"
> >     > > }
> >     > >
> >     > >
> >     > > I checked out the JoltTransformJSON processor and some examples, such as
> >     > > the nested data to "prefix soup" demo [2], but it seems as though I need to
> >     > > enter information about the schema for the incoming data in order to
> >     > > transform it. Ideally, I'd like to have a processor "just figure it out"
> >     > > without explicit entry of a schema.
> >     > >
> >     > > Is there any way to accomplish this in a generic way with JoltTransformJSON
> >     > > (or another native processor)?
> >     > >
> >     > > If not, would a ticket requesting a "Field Flattener" processor much like
> >     > > the one included in StreamSets Data Collector [3] be worthwhile?
> >     > >
> >     > > Thanks in advance!
> >     > >
> >     > > -Nick
> >     > >
> >     > >
> >     > > [1]
> >     > > https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
> >     > >
> >     > > [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
> >     > >
> >     > > [3]
> >     > > https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
> >     >
> >     >
> >
> >
> >
>
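
For reference, the flattening asked about in the original question can be sketched in a few lines of Python. This is only an illustration of the idea, not the Groovy script linked above and not a NiFi processor. The function name `flatten`, the `sep` parameter, and the choice to index array elements by position are all assumptions, since the thread does not say how arrays should be handled.

```python
import json
from typing import Any, Dict


def flatten(value: Any, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Collapse nested JSON objects into a single level of dotted keys.

    Hypothetical helper for illustration only. Empty objects and empty
    arrays produce no keys at all in this sketch.
    """
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}{sep}{key}" if prefix else key
            flat.update(flatten(child, path, sep))
    elif isinstance(value, list):
        # Assumption: array elements are addressed by their position.
        for index, child in enumerate(value):
            path = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, path, sep))
    else:
        # Scalars (strings, numbers, booleans, null) end the recursion.
        flat[prefix] = value
    return flat


if __name__ == "__main__":
    # Abbreviated version of the Yahoo weather sample from the question.
    sample = json.loads("""
    {"query": {"count": 1, "lang": "en-US",
               "results": {"channel": {"item": {"condition":
                   {"code": "33", "temp": "63"}}}}}}
    """)
    print(json.dumps(flatten(sample), indent=1))
```

No schema is needed because the function walks whatever structure it is given, which is the "just figure it out" behavior requested. Something along these lines could presumably be adapted for ExecuteScript in the meantime, though the flow file handling around it is not shown here.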
