nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: "Processor requires an upstream connection" for FetchS3Object?
Date Wed, 13 Jan 2016 03:04:35 GMT
Russell/Corey,

In 0.4.0 there is a new way for processors to indicate what they expect as
far as input: it can be required, allowed, or forbidden. This prevents
scenarios like the one with ExecuteSQL, which at one point required an
input FlowFile but could still be started without an incoming connection,
leaving it unclear why it wasn't doing anything. So now, if a processor
says input is required, it is considered invalid when there are no
incoming connections.
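
To make the rule concrete, here is a minimal sketch in Java. It is a
simplified model and not NiFi's actual validation code: the class, enum,
and method names are made up for illustration, and the "forbidden" message
is hypothetical. Only the "required" message is taken from the log quoted
further down in this thread. In NiFi itself, a processor declares its
expectation with the @InputRequirement annotation.

```java
import java.util.Optional;

// Simplified, hypothetical model of the input requirement check.
// None of these names are NiFi classes.
public class InputRequirementSketch {

    // The three options a processor can declare for incoming FlowFiles.
    enum Requirement { INPUT_REQUIRED, INPUT_ALLOWED, INPUT_FORBIDDEN }

    // Returns a validation error, or empty if the processor is valid.
    static Optional<String> validate(Requirement requirement, int incomingConnections) {
        switch (requirement) {
            case INPUT_REQUIRED:
                if (incomingConnections == 0) {
                    // Wording taken from the error reported in this thread.
                    return Optional.of(
                        "Processor requires an upstream connection but currently has none");
                }
                break;
            case INPUT_FORBIDDEN:
                if (incomingConnections > 0) {
                    // Hypothetical wording, for illustration only.
                    return Optional.of("Processor does not allow upstream connections");
                }
                break;
            default:
                // INPUT_ALLOWED: valid with or without incoming connections.
                break;
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A FetchS3Object-like processor with nothing connected is invalid.
        System.out.println(validate(Requirement.INPUT_REQUIRED, 0).isPresent());
        // The same processor with one incoming connection is valid.
        System.out.println(validate(Requirement.INPUT_REQUIRED, 1).isPresent());
    }
}
```

With this model, a processor that requires input and has no incoming
connection is reported as invalid before it can be started, instead of
running and silently doing nothing.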

In the case of FetchS3, there is definitely intent to have a ListS3, but
there are still ways to use it without that...

One scenario might be to retrieve the same bucket on a timer, maybe once a
day or every hour... this could be done with a GenerateFlowFile processor
scheduled to the appropriate interval and feeding into FetchS3, with
FetchS3 containing a hard-coded bucket id. GenerateFlowFile would be acting
as the trigger here.

A second scenario might be to receive messages from somewhere else (JMS,
Kafka, HTTP) which contain bucket ids, and feed these FlowFiles into
FetchS3. The bucket id on FetchS3 could use expression language to
reference a bucket id on the incoming FlowFile.
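
As a rough illustration of that second scenario, the sketch below shows how
a property value such as ${s3.bucket} could be filled in from the
attributes of an incoming FlowFile. This is a toy resolver written for this
example, not NiFi's expression language implementation. The attribute name
s3.bucket and the bucket value are assumptions, and treating a missing
attribute as an empty string is a choice made for this sketch only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy stand-in for attribute references; not NiFi code.
public class BucketExpressionSketch {

    // Matches simple references of the form ${attribute.name}.
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} in the property value with the matching
    // attribute, or with an empty string if the attribute is absent.
    static String resolve(String propertyValue, Map<String, String> attributes) {
        Matcher matcher = REFERENCE.matcher(propertyValue);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String value = attributes.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // Attributes as they might arrive on a FlowFile from JMS, Kafka, or HTTP.
        Map<String, String> attributes = new HashMap<>();
        attributes.put("s3.bucket", "daily-exports"); // hypothetical values
        attributes.put("filename", "report.csv");

        System.out.println(resolve("${s3.bucket}", attributes)); // prints daily-exports
    }
}
```

The point is only that the bucket is no longer hard-coded on the processor:
each incoming FlowFile carries the value to fetch from.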

Hope this helps.

-Bryan

On Tue, Jan 12, 2016 at 9:39 PM, Corey Flowers <cflowers@onyxpoint.com>
wrote:

> Hello Russell,
>
>        Sorry if that seemed short, I was running in to pick my son up
> from  my practice. What I meant to say was that you are correct. Although I
> haven't worked on those processors, I do believe it is expecting the listS3
> processor to function and that is why you are getting that error. I would
> love to know if there is a work around because I would also love to work
> with these processors within the next week or so.
>
> On Tue, Jan 12, 2016 at 9:16 PM, Russell Whitaker <
> russell.whitaker@gmail.com> wrote:
>
>> On Tue, Jan 12, 2016 at 6:11 PM, Corey Flowers <cflowers@onyxpoint.com>
>> wrote:
>> > Ha ha! Well that would do it! :)
>> >
>>
>> I don't know what that would "do" other than confirm that the
>> FetchS3Object processor shipped
>> with v0.4.1 needs its doc to reflect the fact it's not yet usable
>> until a ListS3* processor is implemented
>> and included in the narfile for the distribution.
>>
>> Russell
>>
>> > Sent from my iPhone
>> >
>> >> On Jan 12, 2016, at 9:10 PM, Russell Whitaker <russell.whitaker@gmail.com> wrote:
>> >>
>> >>> On Tue, Jan 12, 2016 at 6:02 PM, Corey Flowers <cflowers@onyxpoint.com> wrote:
>> >>> I haven't worked with this processor but I believe it is looking for
>> >>> the S3 list processor to generate the list of objects to fetch. Did
>> >>> you try that yet?
>> >>
>> >> I mentioned this: "There's no "ListS3Object" processor type which
>> >> might hypothetically populate
>> >> attributes for FetchS3Object to act upon." I should have made this
>> >> doubly explicit that I checked
>> >> in the processor creation dialogue.
>> >>
>> >> Also, this:
>> >>
>> https://mail-archives.apache.org/mod_mbox/nifi-users/201510.mbox/%3CD23C06E8.CA0%25chakrader.dewaragatla@lifelock.com%3E
>> >>
>> >> "There is already a ticket
>> >> (NIFI-840<https://issues.apache.org/jira/browse/NIFI-840>)
>> >> in the hopper to create a ListS3Objects processor that can track
>> >> bucket contents and trigger
>> >> FetchS3Object."
>> >>
>> >> Oh god, it does appear that v0.4.1 ships with an implemented
>> >> FetchS3Object processor but no
>> >> List processor to feed it:
>> >>
>> >> https://issues.apache.org/jira/browse/NIFI-840
>> >>
>> >> Status: unresolved
>> >>
>> >> Description: "A processor is needed that can provide an S3 listing to
>> >> use in conjunction with FetchS3Object. This is to provide a similar
>> >> user experience as with the HDFS processors that perform List/Get."
>> >>
>> >> I think this means I'm horked. And the Relationships section of the
>> >> FetchS3Object doc is still wrong.
>> >>
>> >> Russell
>> >>
>> >>
>> >>> Sent from my iPhone
>> >>>
>> >>>> On Jan 12, 2016, at 8:38 PM, Russell Whitaker <russell.whitaker@gmail.com> wrote:
>> >>>>
>> >>>> I'm running v0.4.1 Nifi, and seeing this (taken from nifi-app.log,
>> >>>> also seeing on mouseover of the "!" icon on the processor on the
>> >>>> canvas):
>> >>>>
>> >>>> 2016-01-12 17:08:50,357 ERROR [NiFi Web Server-18]
>> >>>> o.a.nifi.groups.StandardProcessGroup Unable to start
>> >>>> FetchS3Object[id=f4253204-a2e2-4ce6-ba09-9415e8024dca] due to {}
>> >>>> java.lang.IllegalStateException: Processor FetchS3Object is not in a
>> >>>> valid state due to ['Upstream Connections' is invalid because
>> >>>> Processor requires an upstream connection but currently has none]
>> >>>>
>> >>>> Per:
>> >>>>
>> >>>> <https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.aws.s3.FetchS3Object/index.html>,
>> >>>> FetchS3Object "Retrieves the contents of an S3 Object and writes it
>> >>>> to the content of a FlowFile," which would seem to indicate this is
>> >>>> an "edge" processor that doesn't expect a flowfile from an upstream
>> >>>> processor.
>> >>>>
>> >>>> The "Tags" on the doc are: "Amazon, S3, AWS, Get, Fetch"
>> >>>>
>> >>>> The processor configuration settings themselves strongly indicate it
>> >>>> expects to connect to S3 using the supplied
>> >>>> credentials/bucket/objectkey settings, with no upstream processor.
>> >>>>
>> >>>> But I get this error. What am I missing? There's no GetS3Object
>> >>>> anymore; surely this is the edge processor for directly downloading
>> >>>> S3 objects, yes? There's no "ListS3Object" processor type which might
>> >>>> hypothetically populate attributes for FetchS3Object to act upon.
>> >>>>
>> >>>> Also, there are these obviously incorrect copy/paste lines in the
>> >>>> Relationships area of the API doc referenced above:
>> >>>>
>> >>>> "success - FlowFiles are routed to success after being successfully
>> >>>> copied to Amazon S3"
>> >>>> "failure - FlowFiles are routed to failure if unable to be copied to
>> >>>> Amazon S3"
>> >>>>
>> >>>> No, that's obviously lifted from the PutS3Object doc page, where it's
>> >>>> actually correct:
>> >>>>
>> >>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.aws.s3.PutS3Object/index.html
>> >>>>
>> >>>> Anyone have any insight into this? Thanks in advance.
>> >>>>
>> >>>> Russell
>> >>
>> >> --
>> >> Russell Whitaker
>> >> http://twitter.com/OrthoNormalRuss
>> >> http://www.linkedin.com/pub/russell-whitaker/0/b86/329
>>
>>
>>
>> --
>> Russell Whitaker
>> http://twitter.com/OrthoNormalRuss
>> http://www.linkedin.com/pub/russell-whitaker/0/b86/329
>>
>
>
>
> --
> Corey Flowers
> Vice President, Onyx Point, Inc
> (410) 541-6699
> cflowers@onyxpoint.com
>
> -- This account not approved for unencrypted proprietary information --
>
