nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: Generate flowfiles from flowfile content
Date Wed, 23 Sep 2015 23:04:17 GMT
Sorry I missed Joe's email while sending mine... I can put together a
template showing this.
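
A minimal sketch of the idea behind the ExtractText suggestion quoted below: copy the FlowFile content into an attribute through a regular-expression capture group. This is plain Python, not NiFi code. The function `extract_to_attribute`, the attribute dictionary, and the sample object key are hypothetical and only illustrate the concept.

```python
import re


def extract_to_attribute(content: str, attributes: dict, name: str, pattern: str) -> dict:
    """Return a copy of `attributes` with capture group 1 of `pattern` stored under `name`.

    `pattern` must contain at least one capture group. If nothing matches,
    the attributes are returned unchanged.
    """
    match = re.search(pattern, content, re.DOTALL)
    updated = dict(attributes)
    if match:
        # Strip surrounding whitespace such as a trailing newline.
        updated[name] = match.group(1).strip()
    return updated


# Hypothetical FlowFile content as it might look after splitting the event JSON:
# a single object key and nothing else.
flowfile_content = "incoming/2015/09/report.csv\n"
attrs = extract_to_attribute(flowfile_content, {"uuid": "hypothetical-id"}, "filename", r"(.+)")
print(attrs["filename"])
```

In a real flow the equivalent step would be configured on the processor itself rather than coded, and a downstream processor would then read the attribute.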

On Wednesday, September 23, 2015, Bryan Bende <bbende@gmail.com> wrote:

> David,
>
> Take a look at ExtractText, it is for pulling FlowFile content into
> attributes. I think that will do what you are looking for.
>
> -Bryan
>
> On Wednesday, September 23, 2015, David Klim <davidklmlg@hotmail.com> wrote:
>
>> Hello Bryan,
>>
>> I should have been more specific. What I am trying to do is to fetch
>> files from S3. I am using the GetSQS processor to get new object (files)
>> events, and each event is a json containing the list of new objects (files)
>> in the bucket. The output of the GetSQS is processed by SplitJson and I get
>> flowfiles containing one object key (filename) each. I need to feed this
>> into FetchS3Object to retrieve the actual file, but FetchS3Object expects
>> the flowfile filename attribute (or any other) to be the filename. So I
>> guess the problem is moving the filename string from the flowfile content
>> to some attribute.
>>
>> If there is no other alternative, I will implement this processor.
>>
>> Thanks!
>>
>> ------------------------------
>> From: rbraddy@softnas.com
>> To: users@nifi.apache.org
>> Subject: RE: Generate flowfiles from flowfile content
>> Date: Wed, 23 Sep 2015 19:59:21 +0000
>>
>> Good idea, Adam.
>>
>>
>>
>> I will post a separate review thread on the dev@ list to track comments.
>>
>>
>>
>> Here’s the repository link:  https://github.com/rickbraddy/nifishare
>>
>>
>>
>>
>>
>> Thanks
>>
>> Rick
>>
>>
>>
>> *From:* Adam Taft [mailto:adam@adamtaft.com]
>> *Sent:* Wednesday, September 23, 2015 1:48 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: Generate flowfiles from flowfile content
>>
>>
>>
>> Not speaking for the entire community, but I am sure that such a
>> contribution would (at minimum) be appreciated for review, consideration
>> and potential inclusion.  The best thing would be ideally hosting the
>> source code somewhere that the rest of the community could go to for
>> review.  Maybe you could host the GetFileData and PutFileData processors on
>> a GitHub repository somewhere?
>>
>> I think the idea you proposed is good, but might need to be aligned with
>> the work (if any) for the referenced ListFile and FetchFile
>> implementation.  And the differences in your PutFileData vs. PutFile would
>> ideally be well vetted as well.
>>
>> Adam
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rbraddy@softnas.com> wrote:
>>
>> We have already developed a modified GetFile called GetFileData
>> that takes an incoming FlowFile containing the path to the file/directory
>> that needs to be transferred.  There is a corresponding PutFileData on the
>> other side that accepts the incoming file/directory that creates the
>> directory/tree as needed or writes the file, then sets the permissions and
>> ownership.  GetFileData also receives a file.rootdir attribute that gets
>> passed along to PutFileData, so it can rebase the original file’s location
>> relative to the configured target directory.  Unlike GetFile/PutFile, these
>> processors work with entire directory trees and are triggered by incoming
>> FlowFiles to GetFileData.
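
The rebasing step described above can be sketched as follows, assuming POSIX-style paths. The function name `rebase` and the example directories are hypothetical; the actual GetFileData/PutFileData code is not shown in this thread and may work differently.

```python
from pathlib import PurePosixPath


def rebase(original: str, root_dir: str, target_dir: str) -> str:
    """Map `original` (located under `root_dir`) to the same relative path under `target_dir`.

    Raises ValueError if `original` is not inside `root_dir`.
    """
    relative = PurePosixPath(original).relative_to(root_dir)
    return str(PurePosixPath(target_dir) / relative)


# `root_dir` plays the role of the file.rootdir attribute; `target_dir` stands in
# for the directory configured on the receiving side (both values are made up).
print(rebase("/data/export/a/b.txt", "/data/export", "/mnt/target"))
```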
>>
>>
>>
>> Eventually, we want to further enhance these two processors so they can
>> break large files into “chunks” and send as multi-part files that get
>> reassembled by PutFileData, resolving the limitations associated with huge
>> files and content repository size; e.g., there are default 100MB chunk
>> threshold and 10MB chunk size properties that will control the chunking, if
>> enabled.
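
A rough sketch of how a threshold and chunk size like the ones mentioned above could drive the split. The function `plan_chunks`, the choice of 1 MB = 1024 * 1024 bytes, and the rule that only files strictly larger than the threshold are chunked are all assumptions, not details taken from the processors.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def plan_chunks(file_size: int, threshold: int = 100 * MB, chunk_size: int = 10 * MB):
    """Return a list of (offset, length) byte ranges covering a file of `file_size` bytes.

    Files at or below `threshold` are sent as one piece; larger files are cut
    into `chunk_size` pieces, with a shorter final piece if needed.
    """
    if file_size <= threshold:
        return [(0, file_size)]
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


chunks = plan_chunks(105 * MB)
print(len(chunks), chunks[-1])
```

The receiving side would write each range at its offset, so chunks could arrive in any order and still reassemble into the original file.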
>>
>>
>>
>> If the community is interested and would benefit from these processors, we’re
>> happy to consider further generalizing and contributing these processors,
>> along with any further refinements based upon community review and feedback.
>>
>>
>>
>> I believe these processors would address both the Jira and David’s
>> original inquiry.
>>
>>
>>
>> Rick
>>
>>
>>
>> *From:* Adam Taft [mailto:adam@adamtaft.com]
>> *Sent:* Wednesday, September 23, 2015 1:09 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: Generate flowfiles from flowfile content
>>
>>
>>
>> Right.  This would be the use case that FetchFile [1] would help solve.
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-631
>>
>>
>>
>> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bbende@gmail.com> wrote:
>>
>> Hi David,
>>
>>
>>
>> When you say "files I need to retrieve", are you referring to files on
>> the local filesystem where NiFi is running?
>>
>>
>>
>> If so, I am not aware of an existing processor that does that. Currently
>> we have GetFile which polls a directory, but that is not what you want here.
>>
>>
>>
>> It would be fairly straightforward to implement with a custom processor
>> though... You would read the incoming FlowFile content to get the filename,
>> then create a new FlowFile with your desired name, and write the content of
>> the local file to the new FlowFile.
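
The logic of such a processor can be sketched outside NiFi as below. The `FlowFile` dataclass is a hypothetical stand-in, not the NiFi API; a real processor would be written in Java against the framework's session interface. The sketch only shows the three steps: read the path from the content, name the new FlowFile, and copy the local file into it.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FlowFile:
    """Hypothetical stand-in for a FlowFile: raw content plus string attributes."""
    content: bytes = b""
    attributes: dict = field(default_factory=dict)


def fetch_local_file(incoming: FlowFile) -> FlowFile:
    """Treat the incoming content as a local path and return a FlowFile holding that file."""
    path = Path(incoming.content.decode("utf-8").strip())
    attributes = dict(incoming.attributes)
    attributes["filename"] = path.name  # name the new FlowFile after the file
    return FlowFile(content=path.read_bytes(), attributes=attributes)


# Usage with a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "example.txt"
    source.write_bytes(b"hello")
    result = fetch_local_file(FlowFile(content=str(source).encode("utf-8")))
    print(result.attributes["filename"], len(result.content))
```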
>>
>>
>>
>> -Bryan
>>
>>
>>
>>
>>
>> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <davidklmlg@hotmail.com>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> In a flow I am defining, I receive a flowfile containing a JSON
>> string. Using the SplitJson processor I can extract some JSON paths
>> pointing to some files I need to retrieve, but the filename is the content
>> of the generated flowfile. So I would need to be able to read the content
>> and generate a flowfile with that name instead. How could I do that?
>>
>>
>>
>> Thanks!
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
> --
> Sent from Gmail Mobile
>

