storm-user mailing list archives

From Junguk Cho <>
Subject Re: Basic questions about Storm
Date Sat, 11 Jun 2016 01:57:42 GMT
Hi, Jungtaek.

Thank you for the reply.

I have the following follow-up questions.

1. In the example (WordCountTopology), the WordCount class uses
String word = tuple.getString(0); to get the string (the word).
So I don't understand the exact roles of "word" and "count". Are they used
internally for a Map-like structure?
To be clear, do bolts exchange data in the format "word" : <data>?
About the default and non-default streams: does every tuple include its
stream id when it is sent?

3. To be clear, if we set "topology.testing.always.try.serialize" to
"false", Storm does not use serialization for inter-process and inter-node
communication?
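
As a rough illustration of what a "always try to serialize" debug switch could do for transfers inside one process, the sketch below serializes the value only to check that it can be serialized, then hands over the original reference. This is a simplified model under stated assumptions: transferLocal is a hypothetical helper, and Java's built-in serialization stands in for Kryo, which Storm uses by default.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class SerializeCheckSketch {
    // Hypothetical model of an in-process hand-off between two threads.
    // With the flag off, the reference is passed through untouched.
    // With the flag on, the value is serialized first purely as a check,
    // so a non-serializable value fails here instead of later, when the
    // same tuple would have to cross a process or node boundary.
    static Object transferLocal(Object value, boolean alwaysTrySerialize) throws IOException {
        if (alwaysTrySerialize) {
            try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(value); // throws NotSerializableException for unsupported types
            }
        }
        return value; // the consumer still receives the original object
    }

    public static void main(String[] args) throws IOException {
        Object notSerializable = new Object();

        // Flag off: the object passes through without any serialization.
        System.out.println(transferLocal(notSerializable, false) == notSerializable);

        // Flag on: the same object is rejected early.
        try {
            transferLocal(notSerializable, true);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
    }
}
```

The extra serialization pass on every hand-off is also why such a check costs performance.
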

Thanks in advance.
- Junguk

2016-06-10 18:00 GMT-04:00 Jungtaek Lim <>:

> Hi Junguk,
> 1. In declareOutputFields, you declare the schema of this component's
> output stream. The first value of a tuple is matched to "word", and the
> second value is matched to "count". You can access a value by field
> name or by index.
> Btw, declare() declares the default stream, and there are other methods
> that declare named (non-default) streams.
> 2. When you rebalance a topology, you're encouraged to specify a wait-time,
> too.
> The topology is deactivated immediately, so the Spout's nextTuple() is no
> longer called; only the Bolts keep running to handle in-flight tuples
> during the wait-time.
> If in-flight tuples are still left after that, they will not be acked. So
> if the Spout's datasource is RabbitMQ in ack mode, Kafka, or similar, the
> Spout will read them from the datasource again.
> 3. Right. To catch serialization issues early, there's the option
> "topology.testing.always.try.serialize" for debugging purposes. Note that
> it affects performance, so it should be disabled ("false" by default) in
> production environments.
> Hope this helps.
> Thanks,
> Jungtaek Lim (HeartSaVioR)
> On Sat, Jun 11, 2016 at 3:27 AM, Junguk Cho <> wrote:
>> Hi, I have some basic questions.
>> 1. About Tuple.
>> We declare the tuple's fields in declareOutputFields.
>> For example, declarer.declare(new Fields("word", "count"));
>> Are "word" and "count" forwarded to the next node with the actual data?
>> What are the roles of "word" and "count" here internally?
>> 2. About rebalancing.
>> In Storm, there is a rebalancing capability.
>> What happens to in-flight tuples while Storm rebalances a topology?
>> Are they dropped and replayed?
>> 3. Serialization.
>> In Storm, as far as I know, serialization does not happen for
>> inter-thread communication. For inter-process and inter-node
>> communication, serialization is required.
>> Is that right?
>> Thanks,
>> Junguk
