Hi,

I know Storm is designed to run forever, and I also know about Trident's aggregation technique. But shouldn't Storm have a way to let bolts know that a given batch of processing has been completed?

Consider this topology:
Spout------>Bolt-A------>Bolt-B
            |                  |--->Bolt-B
            |                  \--->Bolt-B
            |--->Bolt-A------>Bolt-B
            |                  |--->Bolt-B
            |                  \--->Bolt-B
            \--->Bolt-A------>Bolt-B
                               |--->Bolt-B
                               \--->Bolt-B

  • The stream from Bolt-A to Bolt-B uses a FieldsGrouping.
  • The Spout emits only a few tuples and then stops emitting.
  • Each Bolt-A takes those tuples and generates millions of tuples from them.
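To make the FieldsGrouping point concrete, here is a rough sketch in plain Java (no Storm classes) of the property I am relying on: tuples with the same value in the grouping field always reach the same Bolt-B task. The class name, the `chooseTask` helper, and the hash-modulo rule are my own assumptions for illustration, not Storm's actual implementation.

```java
import java.util.List;

// Hypothetical stand-in for how a fields grouping could pick a target task.
// This is an illustration only, not Storm's real routing code.
class FieldsGroupingSketch {
    // Map a grouping key to one of numTasks Bolt-B tasks.
    // floorMod keeps the result non-negative even for negative hash codes.
    static int chooseTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 9; // nine Bolt-B instances, as in the diagram
        // The repeated key is routed to the same task both times.
        for (String key : List.of("user-1", "user-2", "user-1")) {
            System.out.println(key + " -> Bolt-B task " + chooseTask(key, numTasks));
        }
    }
}
```

The consequence for my problem is that every Bolt-B task holds only part of the data, so every one of them has to be told when to write.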

Bolt-B accumulates the tuples that Bolt-A sends and needs to know when the Spout has finished emitting. Only then can Bolt-B start writing to SQL.
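The behaviour I want from Bolt-B looks roughly like the sketch below. It is plain Java with no Storm API, and it assumes that each upstream Bolt-A task could somehow send an end-of-stream marker, which is exactly the part I do not know how to do in Storm. All names (`AccumulatingBoltSketch`, `onTuple`, `onEndMarker`) are hypothetical, and the "SQL write" is just a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one Bolt-B task: buffer tuples, then write once
// every upstream Bolt-A task has signalled that it is done.
class AccumulatingBoltSketch {
    private final int upstreamTasks;                         // number of Bolt-A tasks feeding this bolt
    private int markersSeen = 0;                             // end markers received so far
    private final List<String> buffer = new ArrayList<>();   // accumulated tuples
    private final List<String> written = new ArrayList<>();  // stand-in for the SQL table

    AccumulatingBoltSketch(int upstreamTasks) {
        this.upstreamTasks = upstreamTasks;
    }

    // Called for every ordinary tuple: accumulate only, do not write yet.
    void onTuple(String value) {
        buffer.add(value);
    }

    // Called for every end-of-stream marker. Returns true only on the call
    // that completes the set of markers and triggers the write.
    boolean onEndMarker() {
        markersSeen++;
        if (markersSeen != upstreamTasks) {
            return false;
        }
        written.addAll(buffer); // stand-in for a batched SQL insert
        buffer.clear();
        return true;
    }

    List<String> written() {
        return written;
    }

    public static void main(String[] args) {
        AccumulatingBoltSketch bolt = new AccumulatingBoltSketch(3);
        bolt.onTuple("a");
        bolt.onEndMarker();
        bolt.onTuple("b");
        bolt.onEndMarker();
        boolean flushed = bolt.onEndMarker();
        System.out.println("flushed=" + flushed + " rows=" + bolt.written().size());
    }
}
```

Counting one marker per upstream task matters because a marker from one Bolt-A task can arrive while another Bolt-A task is still sending tuples.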

Questions:
1. How can all Bolt-B instances be notified that it is time to write to SQL?
2. After all Bolt-B instances have written to SQL, how can I know that all of them have finished writing?
3. How can I stop the topology? I know of localCluster.killTopology("HelloStorm"), but shouldn't there be a way to do it from within a Bolt?

--
Regards,
Navin