From: Sivaprasanna
Date: Tue, 20 Mar 2018 21:49:14 +0530
Subject: Re: FlattenJson
To: dev@nifi.apache.org

I like the idea that Otto suggested. RouteOnJSONPath makes more sense, since
writing the flattened JSON to attributes would then be restricted to that
processor alone.

On Tue, Mar 20, 2018 at 8:37 PM, Otto Fowler wrote:

> Why not create a new processor that does routeOnJSONPath and works on the
> flow file?
>
>
> On March 20, 2018 at 10:39:37, Jorge Machado (jomach@me.com) wrote:
>
> So that is what we are actually doing with EvaluateJsonPath. The problem
> with that is that it is hard to build something generic if we need to
> specify each property by its name; that's why this idea.
>
> Should I make a PR for this, or is this too business-specific?
>
>
> Jorge Machado
>
>
> On 20 Mar 2018, at 15:30, Bryan Bende wrote:
> >
> > Ok so I guess it depends whether you end up needing all 30 fields as
> > attributes to achieve the logic in your flow, or if you only need a
> > couple.
> >
> > If you only need a couple you could probably use EvaluateJsonPath
> > after FlattenJson to extract just the couple of fields you need into
> > attributes.
> >
> > If you need them all then I guess it makes sense to want the option to
> > flatten into attributes.
> >
> > On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado wrote:
> >> From there on we use a lot of RouteOnAttribute processors and use those
> >> values in SQL queries to other tables, like
> >> select * from someTable where id=${myExtractedAttribute}
> >> To be honest, I tried JoltTransformJSON but I could not get it working :)
> >>
> >> Jorge Machado
> >>
> >>> On 20 Mar 2018, at 15:12, Matt Burgess wrote:
> >>>
> >>> I think Bryan is asking about what happens AFTER this part of the
> >>> flow. For example, if you are doing routing you can use QueryRecord
> >>> (and you won't need the SplitJson), if you are doing transformations
> >>> you can use JoltTransformJSON (often without SplitJson as well), etc.
> >>>
> >>> Regards,
> >>> Matt
> >>>
> >>> On Tue, Mar 20, 2018 at 10:08 AM, Jorge Machado wrote:
> >>>> Hi Bryan,
> >>>>
> >>>> thanks for the help.
> >>>> Our Flow: ExecuteSql -> convertToJSON -> SplitJson -> ExecuteScript
> >>>> with attached code 1.
> >>>>
> >>>> We are now writing a custom processor that does this. It is a copy
> >>>> of FlattenJson, but instead of putting the result into the flow file
> >>>> content we put it into the attributes.
> >>>> That's why I asked if it makes sense to contribute this back.
> >>>>
> >>>> Attached code 1:
> >>>>
> >>>> import org.apache.commons.io.IOUtils
> >>>> import java.nio.charset.*
> >>>> def flowFile = session.get();
> >>>> if (flowFile == null) {
> >>>>     return;
> >>>> }
> >>>> def slurper = new groovy.json.JsonSlurper()
> >>>> def attrs = [:] as Map
> >>>> session.read(flowFile,
> >>>>     { inputStream ->
> >>>>         def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
> >>>>         def obj = slurper.parseText(text)
> >>>>         obj.each { k, v ->
> >>>>             if (v != null && v.toString() != "") {
> >>>>                 attrs[k] = v.toString()
> >>>>             }
> >>>>         }
> >>>>     } as InputStreamCallback)
> >>>> flowFile = session.putAllAttributes(flowFile, attrs)
> >>>> session.transfer(flowFile, REL_SUCCESS)
> >>>>
> >>>> some code removed
> >>>>
> >>>>
> >>>> Jorge Machado
> >>>>
> >>>>> On 20 Mar 2018, at 15:03, Bryan Bende wrote:
> >>>>>
> >>>>> Ok it is still not clear what the reason for needing it in attributes
> >>>>> is though... Is there another processor you are using after this that
> >>>>> only works off attributes?
> >>>>>
> >>>>> Just trying to understand if there is another way to accomplish what
> >>>>> you want to do.
> >>>>>
> >>>>> On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado wrote:
> >>>>>> We are using NiFi for workflow, and we get columns from a database
> >>>>>> such as job_status and job_name plus some nested JSON columns
> >>>>>> (30 columns).
> >>>>>> We need to put them into the attributes of the flow file and not the
> >>>>>> content. The first part (columns without JSON) is done by a Groovy
> >>>>>> script, but it would be nice to use this standard processor and,
> >>>>>> instead of writing the result to the flow file content, write it to
> >>>>>> attributes.
> >>>>>>
> >>>>>> Jorge Machado
> >>>>>>
> >>>>>>> On 20 Mar 2018, at 14:47, Bryan Bende wrote:
> >>>>>>>
> >>>>>>> What would be the main use case for wanting all the flattened values
> >>>>>>> in attributes?
> >>>>>>>
> >>>>>>> If the reason was to keep the original content, we could probably
> >>>>>>> just add an original relationship.
> >>>>>>>
> >>>>>>> Also, I think FlattenJson supports flattening a flow file where the
> >>>>>>> root is an array of JSON documents (although I'm not totally sure),
> >>>>>>> so you'd have to consider what to do in that case.
> >>>>>>>
> >>>>>>> On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard wrote:
> >>>>>>>> No, I do see how this could be convenient in some cases. My comment
> >>>>>>>> was more: you can certainly submit a PR for that feature, but it'll
> >>>>>>>> need to be clearly documented using the appropriate annotations,
> >>>>>>>> documentation, and property descriptions.
> >>>>>>>>
> >>>>>>>> 2018-03-20 10:20 GMT+01:00 Jorge Machado:
> >>>>>>>>
> >>>>>>>>> Hi Pierre, I'm aware of that. So this means the change would not
> >>>>>>>>> be accepted, correct?
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>>
> >>>>>>>>> Jorge Machado
> >>>>>>>>>
> >>>>>>>>>> On 20 Mar 2018, at 09:54, Pierre Villard
> >>>>>>>>>> <pierre.villard.fr@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Jorge,
> >>>>>>>>>>
> >>>>>>>>>> I think this should be carefully documented to remind users that
> >>>>>>>>>> the attributes are in memory. Doing what you propose would mean
> >>>>>>>>>> having in memory the full content of the flow file as long as the
> >>>>>>>>>> flow file is processed in the workflow (unless you remove
> >>>>>>>>>> attributes using UpdateAttribute).
> >>>>>>>>>>
> >>>>>>>>>> Pierre
> >>>>>>>>>>
> >>>>>>>>>> 2018-03-20 7:55 GMT+01:00 Jorge Machado:
> >>>>>>>>>>
> >>>>>>>>>>> Hey guys,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to change the FlattenJson processor so that it can
> >>>>>>>>>>> flatten to the attributes instead of only to the content. Is
> >>>>>>>>>>> this a good idea? Would the PR be accepted?
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers
> >>>>>>>>>>>
> >>>>>>>>>>> Jorge Machado
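
For readers following the thread, here is a rough standalone sketch of what "flatten into attributes" could mean for a nested record. It is plain Python, not NiFi code, and does not reproduce how FlattenJson itself behaves: the function name `flatten_to_attributes`, the "." separator, and the use of list indexes as key parts are all assumptions made for illustration. Like the Groovy script quoted above, it skips null and empty values and stores every value as a string.

```python
from typing import Any, Dict


def flatten_to_attributes(value: Any, prefix: str = "", sep: str = ".") -> Dict[str, str]:
    """Flatten nested JSON-like data into a flat dict of string values.

    Hypothetical helper: the key layout ("a.b.0") is an assumption,
    not the documented output format of any NiFi processor.
    """
    attrs: Dict[str, str] = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # List elements are keyed by their position.
        items = enumerate(value)
    else:
        # Leaf value: skip null and empty strings, as the quoted script does.
        if value is not None and str(value) != "":
            attrs[prefix] = str(value)
        return attrs
    for key, child in items:
        child_prefix = f"{prefix}{sep}{key}" if prefix else str(key)
        attrs.update(flatten_to_attributes(child, child_prefix, sep))
    return attrs


record = {
    "job_name": "nightly",
    "job_status": "DONE",
    "payload": {"id": 42, "tags": ["a", "b"], "note": None},
}
attrs = flatten_to_attributes(record)
# Rough size of what would be held in memory as attributes.
attr_chars = sum(len(k) + len(v) for k, v in attrs.items())
```

The `attr_chars` total is only there to make Pierre's point concrete: every flattened value lives in the attribute map for as long as the flow file is in the flow, so a large document turns into a large in-memory map.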