storm-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Navin Ipe <navin....@searchlighthealth.com>
Subject How are multiple spouts and fields grouping planned out?
Date Sun, 24 Apr 2016 17:44:19 GMT
To parallelize some code, I considered having this topology. The single
[Spout] or [Bolt] represent multiple Spouts or Bolts.

*[Spout]--emit--->[Bolt A]--emit--->[Bolt B]*

If any of the bolts in Bolt A emit a Tuple of value 1, and it gets
processed by a certain bolt in Bolt B, then it is imperative that if any of
the bolts in Bolt A again emits the value 1, it should compulsorily be
processed by the same bolt in Bolt B. I assume fields grouping can handle
this.

To have many spouts work in parallel, my initial thoughts were to have:











*        Integer numberOfSpouts = 10;                String
partialSpoutName = "mongoSpout";        String partialBoltName =
"mongoBolt";                        for(Integer i = 0; i < numberOfSpouts;
++i) {            String spoutName = partialSpoutName +
i.toString();            String boltName = partialBoltName +
i.toString();                        builder.setSpout(spoutName, new
MongoSpout());            builder.setBolt(boltName, new
GenBolt()).shuffleGrouping(spoutName);        }*

But I realized it was probably not the right way, because in case I wanted
all of Bolt A's tuples to go to a single Bolt B, then I'd have to include
cases like this:



















*        switch (numberOfSpouts) {            case 3:
builder.setBolt("sqlWriterBolt", new
SqlWriterBolt(appConfig),3)
.shuffleGrouping(partialBoltName+"2")
.shuffleGrouping(partialBoltName+"1")
.shuffleGrouping(partialBoltName+"0");
                break;            case 2:
builder.setBolt("sqlWriterBolt", new
SqlWriterBolt(appConfig),2)
.shuffleGrouping(partialBoltName+"1")
.shuffleGrouping(partialBoltName+"0");
                break;            case 1:
builder.setBolt("sqlWriterBolt", new
SqlWriterBolt(appConfig),1)
.shuffleGrouping(partialBoltName+"0");
break;            default:                System.out.println("No case
here");        }//switch*

*So three questions:*
*1. *The way I'm using the for loop above is wrong isn't it? If I use a
single builder.setSpout and set the numTasks value, then Storm would create
those many Spout instances automatically?

*2.* When you specify something like this for fields grouping:
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("A","B","C","D"));
    }
What does it mean? Does it mean that 4 types of tuples might be emitted?
When they are received as builder.setBolt("*sqlWriterBolt*", new Bolt_AB(),
3).fieldsGrouping("*mongoBolt*", new Fields("A", "B"));

Does it mean that my the first 2 tuples will go to the Bolt_AB and the next
2 tuples may go to any other bolt that receives tuples from mongoBolt, and
then the next two tuples will go to Bolt_AB again? Is that how it works?

*3. *If I don't specify any new Fields("A", "B"), how does Storm know the
grouping? How does it decide? If I have a tuple like this:
class SomeTuple extends IRichBolt {
  private Integer someID;
  public Integer getSomeID() {return someID;}
}
and if Bolt A sends SomeTuple to Bolt B, with SomeID value (assume a SomeID
value is 9) and the next time Bolt A generates a Tuple with someID = 9 (and
it may generate many tuples with someID=9), how do I ensure that Storm sees
the SomeID value and decides to send it to the Bolt B instance that
processes all someID's which have a value of 9?


-- 
Regards,
Navin

Mime
View raw message