metron-dev mailing list archives

From Laurens Vets <laur...@daemon.be>
Subject Re: [DISCUSS] Using JSON Path to support more complex documents with the JSONMap Parser
Date Fri, 26 Jan 2018 16:51:30 GMT
On 2018-01-25 07:57, Otto Fowler wrote:
> While it would be preferred if all data streamed into the parsers is
> already in ‘stream’ form, as opposed to ‘batched’ form, it may not always
> be possible, or possible at every step of system development.
> 
> I was wondering if it would be worth adding optional support to the
> JSONMap Parser for more complex documents, splitting them in the parser
> into multiple messages. This is similar in function to the JSON Splitter
> processor in NiFi.
> 
> So, a document would come into the JSONMap Parser from Kafka, with some
> embedded set of the real message content, such as in this simplified
> example:
> 
> {
>     "messages" : [
>         { message1},
>         { message2},
>         ….
>         {messageN}
>     ]
> }
> 
> The JSONMap Parser would have a new configuration item for message
> selection, which would be a JSON Path expression:
> 
> "messageSelector" : "$.messages"
> 
> Inside the JSONMap Parser, it would evaluate the expression and apply the
> same processing to each item in the list returned by the expression.
> 
> The Parser interface already supports returning multiple message objects
> from a single byte[] input.
> 
> There is a performance penalty to be paid here: beyond the cost of
> handling more than one message per input, the JSON Path evaluation itself
> adds overhead.
> 
> I can see this being useful in a couple of circumstances:
> 
>    - You want to work with some document format with Metron but do not
>      have NiFi or the equivalent available or set up yet
>    - You want to prototype with Metron before you get the ‘preprocessing’
>      set up
>    - You are not going to be able to use NiFi and are OK with the
>      performance
> 
> I have something on GitHub to look at for more detail:
> ottobackwards/json-path-play
> <https://github.com/ottobackwards/json-path-play>
> 
> Thoughts?

I like this; it's the exact reason why we use the NiFi splitter right now.
We get 'batched' CloudTrail events which need to be split into individual
events...
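For illustration, the splitting step proposed above can be sketched in a few lines of Python. This is only a sketch under assumptions: `split_messages` is a hypothetical helper and not part of Metron, and it understands only simple dotted selectors such as `$.messages`, not full JSON Path. A real implementation would presumably delegate the selector evaluation to a JSON Path library.

```python
import json


def split_messages(raw: bytes, message_selector: str) -> list:
    """Return the embedded messages selected from one batched document.

    Hypothetical helper: supports only dotted selectors like "$.a.b",
    as a simplified stand-in for real JSON Path evaluation.
    """
    document = json.loads(raw.decode("utf-8"))
    path = message_selector.strip()
    if not path.startswith("$"):
        raise ValueError("selector must start with '$'")
    node = document
    # Walk each dotted key after the root; "$" alone selects the document.
    for key in filter(None, path[1:].split(".")):
        node = node[key]
    # A single selected object is treated as a batch of one.
    return node if isinstance(node, list) else [node]


batch = b'{"messages": [{"id": 1}, {"id": 2}, {"id": 3}]}'
for message in split_messages(batch, "$.messages"):
    # Each item would then go through the normal per-message parsing.
    print(json.dumps(message))
```

For the CloudTrail case, log files wrap their events in a top-level `Records` array, so the equivalent selector there would likely be `$.Records`.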
