nifi-dev mailing list archives

From Mark Payne <marka...@hotmail.com>
Subject Re: "Flatten" JSON
Date Fri, 15 Sep 2017 13:01:09 GMT
Nick,

I do believe there's a way to do what you're asking with Jolt, without knowing
any kind of schema. That said, Jolt can get complex pretty quickly and I don't
know it well :)  Personally, I have no problem with having a FlattenRecord
processor. I guess the question here, though, is whether you are using
Record-oriented processors or JSON-specific processors.

Personally, I'd like to see a FlattenRecord processor, rather than FlattenJSON,
because that would allow the transformation to apply to Avro as well (and, as
soon as we get an XML reader built, XML also). However, the Record-oriented
processors would expect that a schema be given (though it could also be
inferred using another existing processor).

-Mark



> On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <nicholasmhughes.nifi@gmail.com> wrote:
> 
> Is there an easy way to "flatten" arbitrary JSON within NiFi?
> 
> For input data like that shown below from Yahoo [1]
> 
> {
>  "query": {
>    "count": 1,
>    "created": "2017-09-15T11:20:26Z",
>    "lang": "en-US",
>    "results": {
>      "channel": {
>        "item": {
>          "condition": {
>            "code": "33",
>            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
>            "temp": "63",
>            "text": "Mostly Clear"
>          }
>        }
>      }
>    }
>  }
> }
> 
> 
> ...I'd like to end up with output something like this:
> 
> {
>  "query.count": 1,
>  "query.created": "2017-09-15T11:20:26Z",
>  "query.lang": "en-US",
>  "query.results.channel.item.condition.code": "33",
>  "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
>  "query.results.channel.item.condition.temp": "63",
>  "query.results.channel.item.condition.text": "Mostly Clear"
> }
> 
> 
> I checked out the JoltTransformJSON processor and some examples, such as
> the nested data to "prefix soup" demo [2], but it seems as though I need to
> enter information about the schema for the incoming data in order to
> transform it. Ideally, I'd like to have a processor "just figure it out"
> without explicit entry of a schema.
> 
> Is there any way to accomplish this in a generic way with JoltTransformJSON
> (or another native processor)?
> 
> If not, would a ticket requesting a "Field Flattener" processor much like
> the one included in StreamSets Data Collector [3] be worthwhile?
> 
> Thanks in advance!
> 
> -Nick
> 
> 
> [1]
> https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
> 
> [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
> 
> [3]
> https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
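
For reference, the schema-free flattening Nick describes can be sketched in a
few lines of Python. This is only an illustration of the transformation itself,
not a NiFi processor or a Jolt spec; the function name `flatten` and the `sep`
parameter are made up for this example, and it assumes arrays and empty objects
should be kept as leaf values rather than expanded.

```python
import json


def flatten(value, prefix="", sep="."):
    """Flatten nested JSON objects into one object with dotted keys.

    Hypothetical helper for illustration only. Lists and empty objects
    are treated as leaf values and copied through unchanged.
    """
    flat = {}
    if isinstance(value, dict) and value:
        for key, child in value.items():
            # Build the dotted path; the top level has no prefix yet.
            path = f"{prefix}{sep}{key}" if prefix else key
            flat.update(flatten(child, path, sep))
    else:
        # Scalars, lists and empty objects end the recursion.
        flat[prefix] = value
    return flat


if __name__ == "__main__":
    # Trimmed-down version of the Yahoo weather response quoted above.
    doc = json.loads("""
    {
      "query": {
        "count": 1,
        "lang": "en-US",
        "results": {
          "channel": {"item": {"condition": {"code": "33", "temp": "63"}}}
        }
      }
    }
    """)
    print(json.dumps(flatten(doc), indent=1))
```

No schema is needed because the function simply walks whatever keys are
present. Key collisions (for example, an input that already contains a literal
"query.count" key) are not handled in this sketch.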

