nifi-dev mailing list archives

From Jorge Machado <jom...@me.com>
Subject Re: FlattenJson
Date Tue, 20 Mar 2018 14:08:51 GMT
Hi Bryan, 

thanks for the help. 
Our flow: ExecuteSql -> convertToJSON -> SplitJson -> ExecuteScript with attached code 1.

We are now writing a custom processor that does this. It is a copy of FlattenJson, but instead
of putting the result into the flowfile content we put it into the attributes.
That’s why I asked if it makes sense to contribute this back.
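
For illustration, here is a minimal sketch of the flatten-into-attributes idea in plain Java, with no NiFi dependencies. The class name FlattenToAttributes, the flatten helper, the "." separator and the sample row are assumptions made up for this example. It is not the actual custom processor and not how FlattenJson is implemented. Null and empty values are skipped, the same way the attached script does it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenToAttributes {

    // Hypothetical helper (not a NiFi API): turns a parsed JSON object made of
    // nested Maps and Lists into a flat map of String keys to String values.
    public static Map<String, String> flatten(Map<String, Object> json, String separator) {
        Map<String, String> attrs = new LinkedHashMap<>();
        walk("", json, separator, attrs);
        return attrs;
    }

    private static void walk(String prefix, Object value, String sep, Map<String, String> out) {
        if (value == null) {
            return; // nulls are skipped, as in the attached script
        }
        if (value instanceof Map) {
            // Nested object: recurse with "parent<sep>child" as the key prefix
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                walk(join(prefix, String.valueOf(e.getKey()), sep), e.getValue(), sep, out);
            }
        } else if (value instanceof List) {
            // Array: use the element index as the key segment
            List<?> list = (List<?>) value;
            for (int i = 0; i < list.size(); i++) {
                walk(join(prefix, String.valueOf(i), sep), list.get(i), sep, out);
            }
        } else {
            // Scalar: keep it only if its string form is non-empty
            String text = value.toString();
            if (!text.isEmpty()) {
                out.put(prefix, text);
            }
        }
    }

    private static String join(String prefix, String key, String sep) {
        return prefix.isEmpty() ? key : prefix + sep + key;
    }

    public static void main(String[] args) {
        // Sample row loosely modelled on the columns mentioned in this thread
        Map<String, Object> details = new LinkedHashMap<>();
        details.put("owner", "etl");
        details.put("retries", 3);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("job_name", "load");
        row.put("job_status", "");
        row.put("details", details);
        System.out.println(flatten(row, "."));
    }
}
```

In a processor, the resulting map is what would be handed to session.putAllAttributes. Every entry then lives in memory as a flow file attribute, so large documents are a poor fit for this approach.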



Attached code 1: 

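import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.InputStreamCallback
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if (flowFile == null) {
    return
}
def slurper = new groovy.json.JsonSlurper()
def attrs = [:] as Map<String,String>
// Read the flow file content and parse it as a single JSON object
session.read(flowFile,
    { inputStream ->
        def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        def obj = slurper.parseText(text)
        // Copy every non-null, non-empty top-level value into the attribute map
        obj.each { k, v ->
            if (v != null && v.toString() != "") {
                attrs[k] = v.toString()
            }
        }
    } as InputStreamCallback)
// Attach the collected values as attributes and route the flow file to success
flowFile = session.putAllAttributes(flowFile, attrs)
session.transfer(flowFile, REL_SUCCESS)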

some code removed


Jorge Machado





> On 20 Mar 2018, at 15:03, Bryan Bende <bbende@gmail.com> wrote:
> 
> Ok it is still not clear what the reason for needing it in attributes
> is though... Is there another processor you are using after this that
> only works off attributes?
> 
> Just trying to understand if there is another way to accomplish what
> you want to do.
> 
> On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado <jomach@me.com> wrote:
>> We are using NiFi for workflow, and we get columns from a database such as
>> job_status and job_name, plus some nested JSON columns (30 columns).
>> We need to put them into the attributes of the flow file and not the content.
>> The first part (columns without JSON) is done by a Groovy script, but it would
>> be nice to use this standard processor and, instead of writing the result to
>> the flow file content, write it to attributes.
>> 
>> 
>> Jorge Machado
>> 
>> 
>> 
>> 
>> 
>>> On 20 Mar 2018, at 14:47, Bryan Bende <bbende@gmail.com> wrote:
>>> 
>>> What would be the main use case for wanting all the flattened values
>>> in attributes?
>>> 
>>> If the reason was to keep the original content, we could probably just
>>> add an original relationship.
>>> 
>>> Also, I think FlattenJson supports flattening a flow file where the
>>> root is an array of JSON documents (although I'm not totally sure), so
>>> you'd have to consider what to do in that case.
>>> 
>>> On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard
>>> <pierre.villard.fr@gmail.com> wrote:
>>>> No, I do see how this could be convenient in some cases. My comment was
>>>> more: you can certainly submit a PR for that feature, but it'll need to be
>>>> clearly documented using the appropriate annotations, documentation, and
>>>> property descriptions.
>>>> 
>>>> 2018-03-20 10:20 GMT+01:00 Jorge Machado <jomach@me.com>:
>>>> 
>>>>> Hi Pierre, I’m aware of that. So this means the change would not be
>>>>> accepted, correct?
>>>>> 
>>>>> Regards
>>>>> 
>>>>> Jorge Machado
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 20 Mar 2018, at 09:54, Pierre Villard <pierre.villard.fr@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Jorge,
>>>>>> 
>>>>>> I think this should be carefully documented to remind users that the
>>>>>> attributes are in memory. Doing what you propose would mean having in
>>>>>> memory the full content of the flow file as long as the flow file is
>>>>>> processed in the workflow (unless you remove attributes using
>>>>>> UpdateAttribute).
>>>>>> 
>>>>>> Pierre
>>>>>> 
>>>>>> 2018-03-20 7:55 GMT+01:00 Jorge Machado <jomach@me.com>:
>>>>>> 
>>>>>>> Hey guys,
>>>>>>> 
>>>>>>> I would like to change the FlattenJson processor so that it can
>>>>>>> flatten to the attributes instead of only to content. Is this a good
>>>>>>> idea? Would the PR be accepted?
>>>>>>> 
>>>>>>> Cheers
>>>>>>> 
>>>>>>> Jorge Machado
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>> 

