nifi-dev mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: FlattenJson
Date Tue, 20 Mar 2018 14:30:16 GMT
Ok so I guess it depends whether you end up needing all 30 fields as
attributes to achieve the logic in your flow, or if you only need a
couple.

If you only need a couple you could probably use EvaluateJsonPath
after FlattenJson to extract just the couple of fields you need into
attributes.

If you need them all then I guess it makes sense to want the option to
flatten into attributes.
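
To make the two cases above concrete, here is a rough, standalone Java sketch. It is not NiFi code and not how FlattenJson is implemented: AttributeFlattener, flatten and select are hypothetical names, the input is assumed to be already parsed into nested Maps and Lists, and the dotted / bracketed key style is an assumption about what flattened attribute names might look like.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of NiFi: flattens an already-parsed JSON
// object into string key/value pairs that could serve as attributes.
public class AttributeFlattener {

    // Flatten nested maps and lists into "a.b" and "a[0]" style keys (assumed style).
    public static Map<String, String> flatten(Map<String, Object> root) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : root.entrySet()) {
            walk(e.getKey(), e.getValue(), out);
        }
        return out;
    }

    private static void walk(String key, Object value, Map<String, String> out) {
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                walk(key + "." + e.getKey(), e.getValue(), out);
            }
        } else if (value instanceof List) {
            List<?> list = (List<?>) value;
            for (int i = 0; i < list.size(); i++) {
                walk(key + "[" + i + "]", list.get(i), out);
            }
        } else if (value != null && !value.toString().isEmpty()) {
            // Null and empty values are skipped, like in the Groovy script quoted below.
            out.put(key, value.toString());
        }
    }

    // Keep only the named keys, for the "only need a couple" case.
    public static Map<String, String> select(Map<String, String> flat, String... wanted) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String k : wanted) {
            if (flat.containsKey(k)) {
                out.put(k, flat.get(k));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample row: two plain columns and one nested JSON column.
        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("retries", 3);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("job_name", "nightly");
        row.put("job_status", "DONE");
        row.put("payload", payload);

        Map<String, String> flat = flatten(row);
        System.out.println(flat);
        System.out.println(select(flat, "job_status", "payload.retries"));
    }
}
```

The select step is the cheaper option memory-wise, since only the chosen keys would end up as attributes; flatten alone corresponds to putting every field into attributes.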

On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado <jomach@me.com> wrote:
> From there on we use a lot of RouteOnAttribute processors and use those values in SQL queries
> against other tables, like select * from someTable where id=${myExtractedAttribute}
> To be honest, I tried JoltTransformJSON but I could not get it working :)
>
> Jorge Machado
>
>
>
>
>
>> On 20 Mar 2018, at 15:12, Matt Burgess <mattyb149@apache.org> wrote:
>>
>> I think Bryan is asking about what happens AFTER this part of the
>> flow. For example, if you are doing routing you can use QueryRecord
>> (and you won't need the SplitJson), if you are doing transformations
>> you can use JoltTransformJSON (often without SplitJson as well), etc.
>>
>> Regards,
>> Matt
>>
>> On Tue, Mar 20, 2018 at 10:08 AM, Jorge Machado <jomach@me.com> wrote:
>>> Hi Bryan,
>>>
>>> thanks for the help.
>>> Our flow: ExecuteSQL -> convertToJSON -> SplitJson -> ExecuteScript
>>> with attached code 1.
>>>
>>> We are now writing a custom processor that does this. It is a copy of FlattenJson,
>>> but instead of putting the result into the flow file content we put it into the attributes.
>>> That’s why I asked if it makes sense to contribute this back.
>>>
>>>
>>>
>>> Attached code 1:
>>>
>>> import org.apache.commons.io.IOUtils
>>> import org.apache.nifi.processor.io.InputStreamCallback
>>> import java.nio.charset.StandardCharsets
>>>
>>> // session and REL_SUCCESS are bound by ExecuteScript
>>> def flowFile = session.get()
>>> if (flowFile == null) {
>>>     return
>>> }
>>> def slurper = new groovy.json.JsonSlurper()
>>> def attrs = [:] as Map<String, String>
>>> // Read the content as UTF-8 JSON and copy each top-level field into attrs
>>> session.read(flowFile, { inputStream ->
>>>     def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>>     def obj = slurper.parseText(text)
>>>     obj.each { k, v ->
>>>         // Skip null and empty values
>>>         if (v != null && v.toString() != "") {
>>>             attrs[k] = v.toString()
>>>         }
>>>     }
>>> } as InputStreamCallback)
>>> flowFile = session.putAllAttributes(flowFile, attrs)
>>> session.transfer(flowFile, REL_SUCCESS)
>>>
>>> some code removed
>>>
>>>
>>> Jorge Machado
>>>
>>>
>>>
>>>
>>>
>>>> On 20 Mar 2018, at 15:03, Bryan Bende <bbende@gmail.com> wrote:
>>>>
>>>> Ok it is still not clear what the reason for needing it in attributes
>>>> is though... Is there another processor you are using after this that
>>>> only works off attributes?
>>>>
>>>> Just trying to understand if there is another way to accomplish what
>>>> you want to do.
>>>>
>>>> On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado <jomach@me.com> wrote:
>>>>> We are using NiFi for workflow, and we get columns like job_status
>>>>> and job_name plus some nested JSON columns from a database (30 columns).
>>>>> We need to put them into the attributes of the flow file and not the content.
>>>>> The first part (columns without JSON) is done by a Groovy script, but it would be nice
>>>>> to use this standard processor and, instead of writing the result to the flow file content, write it to attributes.
>>>>>
>>>>>
>>>>> Jorge Machado
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On 20 Mar 2018, at 14:47, Bryan Bende <bbende@gmail.com> wrote:
>>>>>>
>>>>>> What would be the main use case for wanting all the flattened values
>>>>>> in attributes?
>>>>>>
>>>>>> If the reason was to keep the original content, we could probably
>>>>>> just add an original relationship.
>>>>>>
>>>>>> Also, I think FlattenJson supports flattening a flow file where the
>>>>>> root is an array of JSON documents (although I'm not totally sure), so
>>>>>> you'd have to consider what to do in that case.
>>>>>>
>>>>>> On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard
>>>>>> <pierre.villard.fr@gmail.com> wrote:
>>>>>>> No, I do see how this could be convenient in some cases. My comment was
>>>>>>> more: you can certainly submit a PR for that feature, but it'll need to be
>>>>>>> clearly documented using the appropriate annotations, documentation, and
>>>>>>> property descriptions.
>>>>>>>
>>>>>>> 2018-03-20 10:20 GMT+01:00 Jorge Machado <jomach@me.com>:
>>>>>>>
>>>>>>>> Hi Pierre, I’m aware of that. So this means the change would not be
>>>>>>>> accepted, correct?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> Jorge Machado
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 20 Mar 2018, at 09:54, Pierre Villard <pierre.villard.fr@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Jorge,
>>>>>>>>>
>>>>>>>>> I think this should be carefully documented to remind users that the
>>>>>>>>> attributes are in memory. Doing what you propose would mean having in
>>>>>>>>> memory the full content of the flow file as long as the flow file is
>>>>>>>>> processed in the workflow (unless you remove attributes using
>>>>>>>>> UpdateAttribute).
>>>>>>>>>
>>>>>>>>> Pierre
>>>>>>>>>
>>>>>>>>> 2018-03-20 7:55 GMT+01:00 Jorge Machado <jomach@me.com>:
>>>>>>>>>
>>>>>>>>>> Hey guys,
>>>>>>>>>>
>>>>>>>>>> I would like to change the FlattenJson processor to make it possible to
>>>>>>>>>> flatten to the attributes instead of only to content. Is this a good
>>>>>>>>>> idea? Would the PR be accepted?
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> Jorge Machado
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
