nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: DistributedMapCacheServer questions
Date Thu, 29 Nov 2018 18:33:03 GMT
I also meant to add that NiFi does provide a "state manager" API to
processors, which when clustered will use ZooKeeper.

The difference between this and DMC is that the state for a processor
is only accessible to the given processor (or all the instances of the
processor across the cluster). It is stored by the processor's UUID.

So if the state doesn't need to be shared across different parts of
the flow, then you can use this instead. You can look at
ProcessContext.getStateManager()
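
To illustrate the scoping difference described above, here is a minimal
Java sketch. It does not use the NiFi API: StateScopeSketch,
ComponentStateStore and SharedCache are hypothetical in-memory stand-ins,
and the UUID strings and keys are made up. It only models the idea that
processor state is keyed by the owning component's id, while a shared map
cache can be read by any part of the flow that knows the key. In a real
processor the state would come from ProcessContext.getStateManager().

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory model of the two scoping styles; not NiFi classes. */
public class StateScopeSketch {

    /** Models per-processor state: every read and write is keyed by the component's id. */
    public static final class ComponentStateStore {
        private final Map<String, Map<String, String>> stateByComponentId = new ConcurrentHashMap<>();

        /** Returns the state owned by the given component, or an empty map if none was stored. */
        public Map<String, String> getState(String componentId) {
            return stateByComponentId.getOrDefault(componentId, Collections.emptyMap());
        }

        /** Replaces the whole state map owned by the given component. */
        public void setState(String componentId, Map<String, String> newState) {
            stateByComponentId.put(componentId, Collections.unmodifiableMap(new HashMap<>(newState)));
        }
    }

    /** Models a shared map cache: one key space visible to every caller. */
    public static final class SharedCache {
        private final Map<String, String> entries = new ConcurrentHashMap<>();

        public void put(String key, String value) {
            entries.put(key, value);
        }

        /** Returns the cached value, or null if the key is absent. */
        public String get(String key) {
            return entries.get(key);
        }
    }

    public static void main(String[] args) {
        // Made-up component ids standing in for two processors' UUIDs.
        String processorA = "uuid-of-processor-a";
        String processorB = "uuid-of-processor-b";

        ComponentStateStore state = new ComponentStateStore();
        state.setState(processorA, Map.of("last.offset", "42"));

        // Processor B looks under its own id, so it sees nothing of A's state.
        System.out.println("A sees: " + state.getState(processorA));
        System.out.println("B sees: " + state.getState(processorB));

        SharedCache cache = new SharedCache();
        cache.put("last.offset", "42");

        // Any part of the flow that knows the key can read the shared entry.
        System.out.println("Shared lookup: " + cache.get("last.offset"));
    }
}
```

If the value only matters to the processor that wrote it, the first style
is enough; if another part of the flow has to read it, you need the shared
cache.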

On Thu, Nov 29, 2018 at 1:08 PM Boris Tyukin <boris@boristyukin.com> wrote:
>
> thanks for the explanation, Bryan! it helps!
>
> Boris
>
> On Thu, Nov 29, 2018 at 12:26 PM Bryan Bende <bbende@gmail.com> wrote:
>>
>> Boris,
>>
>> Yes the "distributed" name is confusing... it is referring to the fact
>> that it is a cache that can be accessed across the cluster, rather
>> than a local cache on each node, but you are correct that the DMC
>> server is a single point of failure.
>>
>> It is important to separate the DMC client and server: there are
>> multiple implementations of the DMC client that can interact with
>> different caches (Redis, HBase, etc), the trade-off being you then
>> have to run/maintain these external systems, instead of the DMC server
>> which is fully managed by NiFi.
>>
>> Regarding ZK... I don't think there is a good answer other than the
>> fact that DMC existed when NiFi was open sourced, and NiFi didn't
>> start using ZK for clustering until the 1.0.0 release, so originally
>> ZK wasn't in the picture. I assume we could implement a DMC client
>> that talked to ZK, just like we have done for Redis, HBase, and
>> others.
>>
>> I'm not aware of any issues with the DMC server persisting to file
>> system or handling concurrent connections; it should be stable.
>>
>> Thanks,
>>
>> Bryan
>>
>> On Thu, Nov 29, 2018 at 11:52 AM Boris Tyukin <boris@boristyukin.com> wrote:
>> >
>> > Hi guys,
>> >
>> > I have a few questions about DistributedMapCacheServer.
>> >
>> > First question, I am confused by the "Distributed" part. If I get it, the
>> > server actually runs on a single node and if it fails, it is game over. Is
>> > that right? Why is NiFi not using ZK for that, since ZK is already used by
>> > the NiFi cluster? I see most of the use cases / examples are about using
>> > DistributedMapCacheServer as a lookup or state store, and this is exactly
>> > what ZK was designed for; it provides redundancy, scalability and 5-10k ops
>> > per sec on a 3-node ZK cluster.
>> >
>> > Second question, I did not find any tools to interact with it other than
>> > Matt's Groovy tool.
>> >
>> > Third question, how does a DistributedMapCacheServer that persists to the
>> > file system handle concurrency and locking? Is it reliable and can it be
>> > trusted?
>> >
>> > And lastly, is there additional overhead to support DistributedMapCacheServer
>> > as another system, or is it pretty much hands-off once a controller is set up?
>> >
>> > Thanks!
>> > Boris
