storm-user mailing list archives

From 王 纯超
Subject Re: Re: is stateful bolts production ready?
Date Fri, 11 Aug 2017 10:15:31 GMT
Thanks Arun, that explains a lot.


From: Arun Mahadevan
Date: 2017-08-11 17:50
To: Wijekoon, Manusha
Subject: Re: is stateful bolts production ready?
If you want to use the provided state implementations, you don’t need to do any of what
you mentioned. Your bolt would be initialized with its last known state in “initState”, and
the bolt can keep updating the state in “execute”. The framework will automatically save
the state to the state backend periodically. See StatefulTopology[1] for an example.
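
To make the pattern concrete, here is a minimal sketch of it in plain Java. It does not use the Storm classes: SimpleKeyValueState, InMemoryState and WordCountBoltSketch are hypothetical stand-ins invented for illustration, and only the method names “initState” and “execute” come from the description above. In a real topology the framework, not a main method, would hand the restored state to the bolt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a key-value state handle; NOT Storm's actual interface.
interface SimpleKeyValueState<K, V> {
    V get(K key, V defaultValue);
    void put(K key, V value);
}

// In-memory stand-in backend so the sketch runs without a cluster.
class InMemoryState<K, V> implements SimpleKeyValueState<K, V> {
    private final Map<K, V> data = new HashMap<>();

    public V get(K key, V defaultValue) {
        return data.getOrDefault(key, defaultValue);
    }

    public void put(K key, V value) {
        data.put(key, value);
    }
}

// Bolt-like class: it receives its last saved state once, then only reads
// and updates that state while processing input.
class WordCountBoltSketch {
    private SimpleKeyValueState<String, Long> state;

    // Called once with the restored state (by the framework in a real system).
    void initState(SimpleKeyValueState<String, Long> restored) {
        this.state = restored;
    }

    // Called per input; updates the state and returns the new count.
    long execute(String word) {
        long count = state.get(word, 0L) + 1;
        state.put(word, count);
        return count;
    }
}

class WordCountDemo {
    public static void main(String[] args) {
        SimpleKeyValueState<String, Long> saved = new InMemoryState<>();
        saved.put("storm", 2L); // pretend this was checkpointed earlier

        WordCountBoltSketch bolt = new WordCountBoltSketch();
        bolt.initState(saved);

        System.out.println(bolt.execute("storm")); // continues from the saved count
        System.out.println(bolt.execute("bolt"));  // starts from the default
    }
}
```

The point of the sketch is that the bolt never writes to the backend itself; it only calls get and put on the state object it was given.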

Right now Storm supports Redis and HBase as state backends. If you want your own state
backend, you need to implement get/put/delete and the logic for prepare/commit/rollback
etc. See the HBase[2] and Redis[3] state implementations to get a better idea. Anyway, I don’t
think Kafka would be ideal as a KV state backend, since it's not easy to do KV lookups without
loading all the data into memory or putting some KV store on top of it.
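
As a rough illustration of what that involves, the sketch below keeps everything in memory and shows one possible meaning for each operation. SketchKeyValueBackend is a hypothetical class; the method names follow the operations listed above, but the signatures are assumptions and should be checked against the actual State interfaces and the HBase/Redis implementations. A real backend would write to an external store in the prepare and commit steps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory backend illustrating get/put/delete plus
// prepare/commit/rollback. Signatures are assumptions, not Storm's API.
class SketchKeyValueBackend {
    private Map<String, String> committed = new HashMap<>(); // last durable snapshot
    private Map<String, String> prepared = null;             // snapshot awaiting commit
    private Map<String, String> working = new HashMap<>();   // live, uncommitted view

    String get(String key) {
        return working.get(key);
    }

    void put(String key, String value) {
        working.put(key, value);
    }

    void delete(String key) {
        working.remove(key);
    }

    // Phase 1: freeze a copy of the working state for this transaction.
    void prepareCommit(long txid) {
        prepared = new HashMap<>(working);
    }

    // Phase 2: promote the prepared snapshot to the durable one.
    void commit(long txid) {
        if (prepared != null) {
            committed = prepared;
            prepared = null;
        }
    }

    // Abort: drop uncommitted changes and return to the last committed snapshot.
    void rollback() {
        prepared = null;
        working = new HashMap<>(committed);
    }
}
```

With a log-structured store such as Kafka, the get call is the hard part, which is the concern raised above: there is no cheap lookup by key unless the data is materialized somewhere else first.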

> In addition to the query, what is the intent of stateful bolt since we can just hold state
> in bolt instance?

It mostly automates what you would otherwise have to implement yourself, and it also ensures that
the state is saved consistently across the whole topology (i.e., if you have multiple bolts with
state, all of their states are saved in an atomic manner).
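
The all-or-nothing idea can be sketched as follows. This is a simplified illustration only, not Storm's actual mechanism (which, as mentioned elsewhere in this thread, is driven by checkpoint tuples); CheckpointParticipant, CounterParticipant and CheckpointCoordinatorSketch are hypothetical names.

```java
import java.util.List;

// Hypothetical participant: one stateful component taking part in a checkpoint.
interface CheckpointParticipant {
    boolean prepare(long txid); // returns false if the snapshot could not be prepared
    void commit(long txid);
    void rollback();
}

// Toy participant whose whole state is a single counter.
class CounterParticipant implements CheckpointParticipant {
    private final boolean failOnPrepare; // lets the demo simulate a failure
    private long working;
    private long prepared;
    private long committed;

    CounterParticipant(boolean failOnPrepare) {
        this.failOnPrepare = failOnPrepare;
    }

    void increment() { working++; }
    long working() { return working; }
    long committed() { return committed; }

    public boolean prepare(long txid) {
        if (failOnPrepare) {
            return false;
        }
        prepared = working;
        return true;
    }

    public void commit(long txid) { committed = prepared; }

    public void rollback() { working = committed; }
}

class CheckpointCoordinatorSketch {
    // Commits every participant or none of them.
    static boolean checkpoint(long txid, List<CheckpointParticipant> participants) {
        for (CheckpointParticipant p : participants) {
            if (!p.prepare(txid)) {
                // One failure aborts the checkpoint for everyone.
                for (CheckpointParticipant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        for (CheckpointParticipant p : participants) {
            p.commit(txid);
        }
        return true;
    }
}
```

If each bolt simply held its own state in fields and saved it whenever it liked, a failure could leave some bolts ahead of others; coordinating the save is what avoids that.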



From: 王 纯超
Date: Friday, August 11, 2017 at 11:21 AM
To: "Wijekoon, Manusha"
Subject: Re: RE: is stateful bolts production ready?

In addition to the query, what is the intent of stateful bolt since we can just hold state
in bolt instance?


From: Wijekoon, Manusha
Date: 2017-08-10 18:56
Subject: RE: is stateful bolts production ready?
In our case we prefer to use our own state implementation. After going through the code and
reading the documentation, the following is how I understand it. Could you please check whether
my understanding is correct?

1. Derive from State and provide an implementation. In the commit(txID) method, are we supposed
to persist the state ourselves, or does the framework take care of that? If the framework takes
care of it, how do we add our own persistence mechanism, for example one that uses Kafka to
persist state?
2. Subclass StateProvider to return State objects for namespaces of interest. For example,
in our case, we wish to use a custom state class in one of the bolts and use defaults for
spouts. In this case, is it safe to return custom states for the bolt concerned and use the
default state provider (InMemoryKeyValueStateProvider) for other namespaces? Is the custom
provider supposed to load the last saved state for the namespace concerned from the persistent
store? Again, if state persistence is handled by the framework, how do we know where to get the
state from?
3. Are checkpoint-related methods called by the same bolt or spout thread?


From: Arun Iyer on behalf of Arun Mahadevan
Sent: Monday, July 24, 2017 2:29 PM
Subject: Re: is stateful bolts production ready?

The bolt just needs to “put” the values into the Key-Value state that the bolt gets initialized
with during “initState”. The framework automatically takes care of saving the state behind
the scenes.

There's an example in storm-starter that you might find useful -<>

You can also find more elaborate documentation here -<>


From: "Wijekoon, Manusha"
Date: Monday, July 24, 2017 at 4:04 PM
Subject: is stateful bolts production ready?


I am thinking of using stateful bolts to manage the state of a bolt. However, from the
documentation it is not clear how to save the bolt state. I understand it has to be done when we
process the checkpoint tuple, but how? Do I just need to update the state object, and Storm picks
it up during the three-phase commit? How does Storm know which state object to pick for checkpointing?

I wasn’t able to find more complete examples either, specifically for when we can’t keep the
state in a key/value map.

Also, has this functionality been tested in production-like environments before?
