nifi-dev mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: Ingest Original data from External system by data's dependent condition
Date Tue, 13 Oct 2015 15:22:09 GMT
We do have an idea that we called HoldFile that hasn't been fully
implemented yet, but has come up a few times:
https://issues.apache.org/jira/browse/NIFI-190

The idea was basically for a processor to "hold" a FlowFile until it was
signaled by another processor to release it.
Seems like this is similar to the ClaimCheck idea and could play into the
scenarios being discussed... hold format A, convert to format B, add some
attributes to B, then release A, transferring those attributes to A.
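
To make the hold/release flow concrete, here is a rough sketch in plain Python. It is not NiFi code and does not use the NiFi API; the MessageStore class, its hold/release methods, and the attribute names are all hypothetical, chosen only to illustrate holding format A under a claim-check key and releasing it later with attributes gathered on B.

```python
import uuid


class MessageStore:
    """Hypothetical in-memory stand-in for a MessageStore (not a NiFi class)."""

    def __init__(self):
        self._held = {}

    def hold(self, content, attributes):
        """Store the original content and return a claim-check key for it."""
        key = str(uuid.uuid4())
        self._held[key] = (content, dict(attributes))
        return key

    def release(self, key, extra_attributes):
        """Remove the held original and merge in attributes from the converted copy."""
        content, attributes = self._held.pop(key)
        attributes.update(extra_attributes)
        return content, attributes


store = MessageStore()

# Hold format A; the returned key travels downstream as an attribute.
original = b"\x01\x02raw-format-A"
claim_check = store.hold(original, {"filename": "event.bin"})

# Format B is produced downstream and picks up new attributes along the way.
converted_attrs = {"claim.check": claim_check, "event.type": "login"}

# Release A, transferring the attributes gathered on B (minus the key itself).
extra = {k: v for k, v in converted_attrs.items() if k != "claim.check"}
content, attrs = store.release(converted_attrs["claim.check"], extra)
```

A real implementation would need a durable store and some handling for originals that are never released; this sketch ignores both.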


On Tue, Oct 13, 2015 at 11:08 AM, Oleg Zhurakousky <
ozhurakousky@hortonworks.com> wrote:

> Great points Joe!
>
> One point I want to add to the discussion. . .
>
> As I am still learning the internals of NiFi, the use case at the core
> of this thread is actually a very common EIP problem and while Aggregator
> (Merger) receiving from multiple inbound sources is one approach, it is not
> the only one.
> Another pattern that would probably fit better here is the ClaimCheck in
> combination with MessageStore.
> The way it would work is like this:
> - Original FlowFile (Message) is stored in MessageStore with the given key
> (ClaimCheck) which becomes an attribute to be passed downstream
> - Somewhere downstream, whenever you are ready for aggregation, use the
> ClaimCheck to access MessageStore to get the original Message to perform
> aggregation or whatever else.
>
> The general benefit is that accessing the original message may be required
> not only for aggregation but for any variety of use cases. Having
> ClaimCheck will give access to the original message to anyone who has it.
>
> So, I want to use this as an opportunity to ask the wider NiFi group (since I
> am still learning it myself) whether such a pattern is supported. I know there
> is a ContentRepository, so I am assuming it wouldn't be that difficult.
>
> Cheers
> Oleg
>
> > On Oct 13, 2015, at 10:56 AM, Joe Witt <joe.witt@gmail.com> wrote:
> >
> > Lot of details passing by here but...
> >
> > Given formats A,B...Z coming in, the following capabilities are
> > generally desired:
> > 1) Extract attributes of each event
> > 2) Make routing decisions on each event based on those extracted attributes
> > 3) Deliver raw/unmodified data to some endpoint (like HDFS)
> > 4) Convert/Transform data to some normalized format (and possibly schema too).
> > 5) Deliver converted data to some endpoint.
> >
> > Steps #1 and #4 involve (naturally) custom work for formats that are
> > not something we can readily support out of the box such as XML, JSON,
> > AVRO, etc...  Even the workaround suggested really only works for the
> > case where you know the original format well enough and we can support
> > it which means we'd likely not have needed the workaround anyway.  So,
> > the issue remains that custom work is required for #1 and #4 cases...
> > Now, if you have packed formats that you think we could support please
> > let us know and we can see about some mechanism of dealing with those
> > formats generically - would be a power user tool of course but
> > avoiding custom work is great when achievable with the right user
> > experience/capability mix.
> >
> > Thanks
> > Joe
> >
> > On Tue, Oct 13, 2015 at 10:06 AM, yejug <msparysh@gmail.com> wrote:
> >> Ok,
> >>
> >> Thank you guys for assistance.
> >>
> >> Looks like Joe's suggestion is more appropriate for me, but there is one
> >> BUT: in the case of 'ExtractXYZAttributes' we must implement implicit
> >> parsing of the encoded message and cannot reuse this logic, e.g. if we
> >> want to do an actual XXX -> Json (for example json =)) conversion in the
> >> future.
> >>
> >> With 99.9% certainty, in my case there will be more inputs besides AVRO
> >> (at minimum msgpack and some custom binary formats), which must be parsed
> >> as well as stored in the original input format.
> >>
> >> So I think that, apart from ConvertXXXToJson + Andrew's workaround, there
> >> are no more alternatives for me now.
> >>
> >> Thanks again
> >>
> >>
> >>
> >> --
> >> View this message in context:
> >> http://apache-nifi-developer-list.39713.n7.nabble.com/Ingest-Original-data-from-External-system-by-data-s-dependent-condition-tp3093p3101.html
> >> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
> >
>
>
