spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Du Li <l...@yahoo-inc.com.INVALID>
Subject distribution of receivers in spark streaming
Date Wed, 04 Mar 2015 23:34:06 GMT
Hi,
I have a set of machines (say 5) and want to evenly launch a number (say 8) of kafka receivers
on those machines. In my code I did something like the following, as suggested in the spark
docs:        val streams = (1 to numReceivers).map(_ => ssc.receiverStream(new MyKafkaReceiver())) 
      ssc.union(streams)
However, from the spark UI, I saw that some machines are not running any instance of the receiver
while some get three. The mapping changed every time the system was restarted. This impacts
the receiving and also the processing speeds.
I wonder if it's possible to control/suggest the distribution so that it would be more even.
How is the decision made in spark?
Thanks,Du


Mime
View raw message