nifi-users mailing list archives

From David Klim <davidkl...@hotmail.com>
Subject RE: Generate flowfiles from flowfile content
Date Thu, 24 Sep 2015 06:40:45 GMT
ExtractText did the job! Thank you very much! :-)
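
For readers landing on this thread: the idea behind the ExtractText step is to run a regular expression over the flowfile content and store a capture group in an attribute, so that a downstream processor such as FetchS3Object can read the object key from an attribute. Below is a minimal plain-Python sketch of that idea only. It does not use any NiFi API; the function name, the sample key, and the `uuid` value are hypothetical.

```python
import re


def extract_to_attribute(content: str, pattern: str, attribute: str,
                         attributes: dict) -> dict:
    """Return a copy of `attributes` with capture group 1 of `pattern`
    (searched in `content`) stored under `attribute`.

    If the pattern does not match, the attributes are returned unchanged.
    """
    updated = dict(attributes)
    match = re.search(pattern, content)
    if match:
        updated[attribute] = match.group(1)
    return updated


# Content of one flowfile after SplitJson: just the object key (hypothetical).
content = "incoming/2015/09/report.csv\n"
attrs = extract_to_attribute(content, r"(\S+)", "filename",
                             {"uuid": "hypothetical-id"})
print(attrs["filename"])
```

The pattern `(\S+)` is just one reasonable choice when the content is a single key with possible trailing whitespace; a real flow would use whatever expression fits the actual content.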

> Date: Wed, 23 Sep 2015 16:05:44 -0700
> Subject: Re: Generate flowfiles from flowfile content
> From: joe.witt@gmail.com
> To: users@nifi.apache.org
> 
> Bryan - you may be right that ExtractText will be the right play once
> SplitJson is done doing its thing.  Perhaps either will work.  Maybe
> we can show either or.  If the schema is fairly well known I'm
> thinking extract json would be the winner.
> 
> thanks
> Joe
> 
> On Wed, Sep 23, 2015 at 4:04 PM, Bryan Bende <bbende@gmail.com> wrote:
> > Sorry I missed Joe's email while sending mine... I can put together a
> > template showing this.
> >
> >
> > On Wednesday, September 23, 2015, Bryan Bende <bbende@gmail.com> wrote:
> >>
> >> David,
> >>
> >> Take a look at ExtractText, it is for pulling FlowFile content into
> >> attributes. I think that will do what you are looking for.
> >>
> >> -Bryan
> >>
> >> On Wednesday, September 23, 2015, David Klim <davidklmlg@hotmail.com>
> >> wrote:
> >>>
> >>> Hello Bryan,
> >>>
> >>> I should have been more specific. What I am trying to do is to fetch
> >>> files from S3. I am using the GetSQS processor to get new object (file)
> >>> events, and each event is a JSON document containing the list of new
> >>> objects (files) in the bucket. The output of GetSQS is processed by
> >>> SplitJson and I get flowfiles containing one object key (filename) each.
> >>> I need to feed this into FetchS3Object to retrieve the actual file, but
> >>> FetchS3Object expects the flowfile filename attribute (or any other) to
> >>> be the filename. So I guess the problem is moving the filename string
> >>> from the flowfile content to some attribute.
> >>>
> >>> If there is no other alternative, I will implement this processor.
> >>>
> >>> Thanks!
> >>>
> >>> ________________________________
> >>> From: rbraddy@softnas.com
> >>> To: users@nifi.apache.org
> >>> Subject: RE: Generate flowfiles from flowfile content
> >>> Date: Wed, 23 Sep 2015 19:59:21 +0000
> >>>
> >>> Good idea, Adam.
> >>>
> >>>
> >>>
> >>> I will post a separate review thread on the dev@ list to track comments.
> >>>
> >>>
> >>>
> >>> Here’s the repository link:  https://github.com/rickbraddy/nifishare
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Thanks
> >>>
> >>> Rick
> >>>
> >>>
> >>>
> >>> From: Adam Taft [mailto:adam@adamtaft.com]
> >>> Sent: Wednesday, September 23, 2015 1:48 PM
> >>> To: users@nifi.apache.org
> >>> Subject: Re: Generate flowfiles from flowfile content
> >>>
> >>>
> >>>
> >>> Not speaking for the entire community, but I am sure that such a
> >>> contribution would (at minimum) be appreciated for review, consideration
> >>> and potential inclusion.  The best thing would be ideally hosting the source
> >>> code somewhere that the rest of the community could go to for review.  Maybe
> >>> you could host the GetFileData and PutFileData processors on a GitHub
> >>> repository somewhere?
> >>>
> >>> I think the idea you proposed is good, but might need to be aligned with
> >>> the work (if any) for the referenced ListFile and FetchFile implementation.
> >>> And the differences in your PutFileData vs. PutFile would ideally be well
> >>> vetted as well.
> >>>
> >>> Adam
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rbraddy@softnas.com> wrote:
> >>>
> >>> We have already developed a modified GetFile called GetFileData
> >>> that takes an incoming FlowFile containing the path to the file/directory
> >>> that needs to be transferred.  There is a corresponding PutFileData on the
> >>> other side that accepts the incoming file/directory and creates the
> >>> directory/tree as needed or writes the file, then sets the permissions and
> >>> ownership.  GetFileData also receives a file.rootdir attribute that gets
> >>> passed along to PutFileData, so it can rebase the original file’s location
> >>> relative to the configured target directory.  Unlike GetFile/PutFile, these
> >>> processors work with entire directory trees and are triggered by incoming
> >>> FlowFiles to GetFileData.
> >>>
> >>>
> >>>
> >>> Eventually, we want to further enhance these two processors so they can
> >>> break large files into “chunks” and send as multi-part files that get
> >>> reassembled by PutFileData, resolving the limitations associated with huge
> >>> files and content repository size; e.g., there are default 100MB chunk
> >>> threshold and 10MB chunk size properties that will control the chunking,
> >>> if enabled.
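
[Editorial sketch] The chunking scheme described above can be illustrated roughly as follows. This is plain Python, not the GetFileData/PutFileData code; the function names are hypothetical, and it assumes that only files larger than the threshold are split, which the message does not state explicitly. The constants mirror the 100MB threshold and 10MB chunk size mentioned in the text.

```python
import io
from typing import BinaryIO, Iterable, Iterator

# Defaults taken from the message; how the real processors apply them is assumed.
CHUNK_THRESHOLD = 100 * 1024 * 1024  # files above this size are chunked
CHUNK_SIZE = 10 * 1024 * 1024        # size of each chunk


def iter_chunks(stream: BinaryIO, total_size: int,
                threshold: int = CHUNK_THRESHOLD,
                chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the whole stream as one piece if it is small enough,
    otherwise yield it in pieces of at most `chunk_size` bytes."""
    if total_size <= threshold:
        yield stream.read()
        return
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def reassemble(chunks: Iterable[bytes]) -> bytes:
    """Concatenate chunks in order, as the receiving side would."""
    return b"".join(chunks)


# Tiny demonstration with small limits so it runs instantly.
data = b"abcdefghijkl"
parts = list(iter_chunks(io.BytesIO(data), len(data), threshold=8, chunk_size=4))
print(parts, reassemble(parts) == data)
```

A real implementation would also need to carry a chunk index and total count with each piece so the receiver can detect missing or reordered chunks; that is omitted here.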
> >>>
> >>>
> >>>
> >>> If the community is interested in, or would benefit from, these processors,
> >>> we’re happy to consider further generalizing and contributing them,
> >>> along with any further refinements based upon community review and feedback.
> >>>
> >>>
> >>>
> >>> I believe these processors would address both the Jira and David’s
> >>> original inquiry.
> >>>
> >>>
> >>>
> >>> Rick
> >>>
> >>>
> >>>
> >>> From: Adam Taft [mailto:adam@adamtaft.com]
> >>> Sent: Wednesday, September 23, 2015 1:09 PM
> >>> To: users@nifi.apache.org
> >>> Subject: Re: Generate flowfiles from flowfile content
> >>>
> >>>
> >>>
> >>> Right.  This would be the use case that FetchFile [1] would help solve.
> >>>
> >>> [1] https://issues.apache.org/jira/browse/NIFI-631
> >>>
> >>>
> >>>
> >>> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bbende@gmail.com> wrote:
> >>>
> >>> Hi David,
> >>>
> >>>
> >>>
> >>> When you say "files I need to retrieve", are you referring to files on
> >>> the local filesystem where NiFi is running?
> >>>
> >>>
> >>>
> >>> If so, I am not aware of an existing processor that does that. Currently
> >>> we have GetFile which polls a directory, but that is not what you want here.
> >>>
> >>>
> >>>
> >>> It would be fairly straightforward to implement with a custom processor
> >>> though... You would read the incoming FlowFile content to get the filename,
> >>> then create a new FlowFile with your desired name, and write the content
> >>> of the local file to the new FlowFile.
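
[Editorial sketch] The three steps described above (read the filename from the incoming content, create a new FlowFile with that name, write the local file's content into it) can be modeled in plain Python as below. This is not NiFi processor code: the `FlowFile` dataclass and `fetch_local_file` function are hypothetical stand-ins used only to show the data movement.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flowfile: raw content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fetch_local_file(incoming: FlowFile) -> FlowFile:
    """Treat the incoming content as a local path and return a new
    FlowFile named after that file and holding the file's bytes."""
    path = Path(incoming.content.decode("utf-8").strip())
    return FlowFile(
        content=path.read_bytes(),
        attributes={"filename": path.name, "path": str(path.parent)},
    )


# Demonstration against a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example.txt"
    sample.write_bytes(b"hello")
    result = fetch_local_file(FlowFile(content=str(sample).encode("utf-8")))

print(result.attributes["filename"], result.content)
```

An actual processor would additionally have to route failures (missing file, unreadable path) to a failure relationship rather than raising, which this sketch leaves out.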
> >>>
> >>>
> >>>
> >>> -Bryan
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <davidklmlg@hotmail.com>
> >>> wrote:
> >>>
> >>> Hello,
> >>>
> >>>
> >>>
> >>> In a flow I am defining, I receive a flowfile containing a JSON string.
> >>> Using the SplitJson processor I can extract some JSON paths pointing to
> >>> some files I need to retrieve, but the filename is the content of the
> >>> generated flowfile. So I would need to be able to read the content and
> >>> generate a flowfile with that name instead. How could I do that?
> >>>
> >>>
> >>>
> >>> Thanks!
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Sent from Gmail Mobile
> >
> >
> >
> > --
> > Sent from Gmail Mobile
 		 	   		  