spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Kafka directsream receiving rate
Date Fri, 05 Feb 2016 19:03:04 GMT
How are you counting the number of messages?

I'd go ahead and remove the settings for backpressure and
maxRatePerPartition, just to eliminate them as variables.
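
One way to count without relying on log output is to sum the offset ranges of each batch. The following is only a rough sketch under stated assumptions: `PartitionRange` is a hypothetical stand-in that mirrors the `fromOffset`/`untilOffset` fields of Spark's `OffsetRange` so the snippet runs without Spark on the classpath, and the topic name and sample numbers are made up for illustration.

```scala
// Hypothetical stand-in for org.apache.spark.streaming.kafka.OffsetRange,
// carrying the same four fields, so the sketch runs without Spark.
final case class PartitionRange(topic: String, partition: Int, fromOffset: Long, untilOffset: Long)

// Messages in one batch = sum over partitions of (untilOffset - fromOffset).
def totalMessages(ranges: Seq[PartitionRange]): Long =
  ranges.map(r => r.untilOffset - r.fromOffset).sum

// Made-up batch: 20 partitions with 5 new messages each.
val sample: Seq[PartitionRange] =
  (0 until 20).map(p => PartitionRange("mytopic", p, 0L, 5L))

println(totalMessages(sample)) // prints 100

// In the real job the ranges would come from each batch RDD of the direct
// stream, along the lines of:
//   k.foreachRDD { rdd =>
//     val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
//     // sum (untilOffset - fromOffset) over ranges, as above
//   }
```

Comparing that per-batch total with whatever count you are seeing now should show whether messages are actually spread across batches or only appear to be.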

On Fri, Feb 5, 2016 at 12:22 PM, Diwakar Dhanuskodi <
diwakar.dhanuskodi@gmail.com> wrote:

> I am using one direct stream. Below is the call to createDirectStream:
>
> val topicSet = topics.split(",").toSet
> val kafkaParams = Map[String, String]("bootstrap.servers" -> "datanode4.isdp.com:9092")
> val k = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
>   ssc, kafkaParams, topicSet)
>
> When I replace the createDirectStream call with createStream, all messages
> were read by one DStream block:
> val k = KafkaUtils.createStream(ssc, "datanode4.isdp.com:2181", "resp", topicMap,
>   StorageLevel.MEMORY_ONLY)
>
> I am using the spark-submit command below to execute:
> ./spark-submit --master yarn-client \
>   --conf "spark.dynamicAllocation.enabled=true" \
>   --conf "spark.shuffle.service.enabled=true" \
>   --conf "spark.sql.tungsten.enabled=false" \
>   --conf "spark.sql.codegen=false" \
>   --conf "spark.sql.unsafe.enabled=false" \
>   --conf "spark.streaming.backpressure.enabled=true" \
>   --conf "spark.locality.wait=1s" \
>   --conf "spark.shuffle.consolidateFiles=true" \
>   --conf "spark.streaming.kafka.maxRatePerPartition=1000000" \
>   --driver-memory 2g --executor-memory 1g \
>   --class com.tcs.dime.spark.SparkReceiver \
>   --files /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/etc/hadoop/conf/mapred-site.xml,/etc/hadoop/conf/yarn-site.xml,/etc/hive/conf/hive-site.xml \
>   --jars /root/dime/jars/spark-streaming-kafka-assembly_2.10-1.5.1.jar,/root/Jars/sparkreceiver.jar \
>   /root/Jars/sparkreceiver.jar
>
> -------- Original message --------
> From: Cody Koeninger <cody@koeninger.org>
> Date:05/02/2016 22:07 (GMT+05:30)
> To: Diwakar Dhanuskodi <diwakar.dhanuskodi@gmail.com>
> Cc: user@spark.apache.org
> Subject: Re: Kafka directsream receiving rate
>
> If you're using the direct stream, you have 0 receivers.  Do you mean you
> have 1 executor?
>
> Can you post the relevant call to createDirectStream from your code, as
> well as any relevant spark configuration?
>
> On Thu, Feb 4, 2016 at 8:13 PM, Diwakar Dhanuskodi <
> diwakar.dhanuskodi@gmail.com> wrote:
>
>> Adding more info.
>>
>> Batch interval is 2000 ms.
>> I expect all 100 messages to go through one DStream from the direct stream,
>> but it receives them at a rate of 10 messages at a time. Am I missing some
>> configuration here? Any help is appreciated.
>>
>> Regards
>> Diwakar.
>>
>> -------- Original message --------
>> From: Diwakar Dhanuskodi <diwakar.dhanuskodi@gmail.com>
>> Date:05/02/2016 07:33 (GMT+05:30)
>> To: user@spark.apache.org
>> Cc:
>> Subject: Kafka directsream receiving rate
>>
>> Hi,
>> Using Spark 1.5.1.
>> I have a topic with 20 partitions. When I publish 100 messages, the Spark
>> direct stream receives 10 messages per DStream. I have only one
>> receiver. When I used createStream, the receiver received the entire 100
>> messages at once.
>>
>> Appreciate any help.
>>
>> Regards
>> Diwakar
>
>
