nifi-users mailing list archives

From Matthew Clarke <matt.clarke....@gmail.com>
Subject Re: PutDistributedMapCache
Date Tue, 12 Jan 2016 13:29:13 GMT
Sudeep,
       I was a little off on my second scenario.  The DetectDuplicate
processor uses the DistributedMapCache service all on its own.  Files that
are routed through it are added to the cache if they are not already in
the cache; if they are already there, they are routed to duplicate.  The
PutDistributedMapCache processor was a community contribution, and there is
currently no processor that makes use of the data it caches.
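
To make the check-and-store behavior described above concrete, here is a
minimal sketch.  It is not the NiFi API: MapCacheSketch, put_if_absent and
route are hypothetical names, and a plain in-memory dict stands in for the
cache service.  The "non-duplicate" label for first-time files is likewise
an assumption made for illustration.

```python
class MapCacheSketch:
    """In-memory stand-in for a key-value cache service (hypothetical, not the NiFi API)."""

    def __init__(self):
        self._entries = {}

    def put_if_absent(self, key, value):
        """Store value under key only if key is new; return True if it was stored."""
        if key in self._entries:
            return False
        self._entries[key] = value
        return True


def route(cache, identifier):
    """Label an identifier 'non-duplicate' on first sight and 'duplicate' afterwards."""
    # The check and the store happen in one step, so no separate "put" stage is needed.
    return "non-duplicate" if cache.put_if_absent(identifier, True) else "duplicate"


cache = MapCacheSketch()
results = [route(cache, name) for name in ["a.csv", "b.csv", "a.csv"]]
print(results)
```

The point of the sketch is only that a single check-and-store step covers
the duplicate-detection case without a separate caching step in front of it.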

       We should probably build a processor that makes use of the data
that can be loaded by the PutDistributedMapCache processor.  Is there a
particular use case you are trying to solve where this would be applicable?

Thanks,
Matt

On Tue, Jan 12, 2016 at 8:11 AM, Matthew Clarke <matt.clarke.138@gmail.com>
wrote:

> Sudeep,
>     The DistributedMapCache is typically used to prevent the consumption
> of duplicate data by some of the ingest-type processors (GetHBase,
> ListHDFS, and ListSFTP).  NiFi uses the service to keep a listing of what
> has been consumed so the same files are not consumed multiple times. The
> service can also be used to detect whether duplicate data already exists
> within a NiFi instance or cluster. This would be the scenario where some
> source is pushing data to your NiFi and perhaps pushes the same data more
> than once. You want to catch these duplicates so you can perhaps kick them
> out of your flow. For this you would use the PutDistributedMapCache
> processor to cache all incoming data and then use the DetectDuplicate
> processor to find those duplicates.
>
>     Was there a different use case you were looking to solve using the
> DistributedMapCache service?
>
> Thanks,
> Matt
>
> On Tue, Jan 12, 2016 at 4:36 AM, sudeep mishra <sudeepshekharm@gmail.com>
> wrote:
>
>> Hi,
>>
>> I can cache some data to be used in a NiFi flow. I can see the
>> processor PutDistributedMapCache in the documentation, which saves key-value
>> pairs in the DistributedMapCache for NiFi, but I do not see any processor to
>> read this data. How can I read data from the DistributedMapCache in my data flow?
>>
>>
>> Thanks & Regards,
>>
>> Sudeep Shekhar Mishra
>>
>>
>
