Can you please take a thread dump and check whether there are blocking issues on
the producer side? Do you have a single Producer instance shared by multiple
threads?
Are you using the Scala producer or the new Java producer? Also, what are your
producer properties?
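
If the producer runs on the JVM, blocked threads can also be listed from inside the process. The following is only a minimal sketch using the standard java.lang.management API; the class name BlockedThreadReport is made up for illustration, and running `jstack <pid>` against the producer process gives the same information from outside.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: lists threads of the current JVM
// that are BLOCKED waiting to enter a monitor.
public class BlockedThreadReport {

    // Returns one line per BLOCKED thread, naming the lock and its owner.
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false, false: skip locked-monitor and synchronizer details,
        // the thread state and contended lock are still reported.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName()
                        + " waiting for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> blocked = blockedThreads();
        if (blocked.isEmpty()) {
            System.out.println("No BLOCKED threads found");
        }
        for (String line : blocked) {
            System.out.println(line);
        }
    }
}
```

If many sender threads show up as blocked on the same lock, that would point to contention on a shared producer.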
> Hi Gwen,
> I have changed the Java code of JavaKafkaWordCount to use reduceByKeyAndWindow
> in Spark.
> Not sure about the throughput, but:
>
> "I mean that the words counted in spark should grow up" The Spark
> word count example doesn't accumulate.
> It gets an RDD every n seconds and counts the words in that RDD. So we
> don't expect the count to go up.
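
To make the difference concrete, here is a plain-Java sketch (no Spark involved) that contrasts counting each batch on its own with folding every batch into a running total. The class and method names are invented for illustration. In Spark Streaming itself, reduceByKeyAndWindow aggregates over a sliding window, and updateStateByKey is the usual way to keep a total across batches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: simulates per-batch counts versus a
// running total, without using any Spark API.
public class BatchVsRunningCount {

    // Counts the words of a single batch, as the example does per interval.
    public static Map<String, Integer> countBatch(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Folds one batch's counts into a total that is kept across batches.
    public static void addToTotal(Map<String, Integer> total,
                                  Map<String, Integer> batch) {
        batch.forEach((w, c) -> total.merge(w, c, Integer::sum));
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("kafka", "spark", "kafka"),
                Arrays.asList("kafka", "spark"));
        Map<String, Integer> total = new HashMap<>();
        for (List<String> batch : batches) {
            Map<String, Integer> perBatch = countBatch(batch);
            addToTotal(total, perBatch);
            // The per-batch count resets every interval; the total only grows.
            System.out.println("batch=" + perBatch + " total=" + total);
        }
    }
}
```

The per-batch map is what the unmodified example prints every interval, which is why its numbers do not keep growing.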
>
>
> > Hi Guys,
> > Could anyone explain to me how to get Kafka working with Spark? I am using
> > JavaKafkaWordCount.java as a test, and the command line is:
> >
> > ./runexample org.apache.spark.streaming.examples.JavaKafkaWordCount
> > spark://192.168.0.13:7077 computer49:2181 testconsumergroup unibs.it 3
> >
> > and as a producer I am using this command:
> >
> > rdkafka_cachesender t unibs.nec p 1 b 192.168.0.46:9092 f output.txt
> > l 100 n 10
> >
> > rdkafka_cachesender is a program I developed that sends the content of
> > output.txt to Kafka, where l is the upper bound on the length of each send
> > and n is the number of lines to send in a row. Below is the throughput
> > calculated by the program:
> >
> > File is 2235755 bytes
> > throughput (b/s) = 699751388
> > throughput (b/s) = 723542382
> > throughput (b/s) = 662989745
> > throughput (b/s) = 505028200
> > throughput (b/s) = 471263416
> > throughput (b/s) = 446837266
> > throughput (b/s) = 409856716
> > throughput (b/s) = 373994467
> > throughput (b/s) = 366343097
> > throughput (b/s) = 373240017
> > throughput (b/s) = 386139016
> > throughput (b/s) = 373802209
> > throughput (b/s) = 369308515
> > throughput (b/s) = 366935820
> > throughput (b/s) = 365175388
> > throughput (b/s) = 362175419
> > throughput (b/s) = 358356633
> > throughput (b/s) = 357219124
> > throughput (b/s) = 352174125
> > throughput (b/s) = 348313093
> > throughput (b/s) = 355099099
> > throughput (b/s) = 348069777
> > throughput (b/s) = 348478302
> > throughput (b/s) = 340404276
> > throughput (b/s) = 339876031
> > throughput (b/s) = 339175102
> > throughput (b/s) = 327555252
> > throughput (b/s) = 324272374
> > throughput (b/s) = 322479222
> > throughput (b/s) = 319544906
> > throughput (b/s) = 317201853
> > throughput (b/s) = 317351399
> > throughput (b/s) = 315027978
> > throughput (b/s) = 313831014
> > throughput (b/s) = 310050384
> > throughput (b/s) = 307654601
> > throughput (b/s) = 305707061
> > throughput (b/s) = 307961102
> > throughput (b/s) = 296898200
> > throughput (b/s) = 296409904
> > throughput (b/s) = 294609332
> > throughput (b/s) = 293397843
> > throughput (b/s) = 293194876
> > throughput (b/s) = 291724886
> > throughput (b/s) = 290031314
> > throughput (b/s) = 289747022
> > throughput (b/s) = 289299632
> > The throughput drops after a few seconds and does not stay at the level of
> > the initial values:
> >
> > throughput (b/s) = 699751388
> > throughput (b/s) = 723542382
> > throughput (b/s) = 662989745
> > Another question is about Spark. About 15 seconds after I start the Spark
> > command, Spark keeps repeating the same word counts, but my program
> > continues to send words to Kafka, so I would expect the counts in Spark to
> > keep growing. I have attached the log from Spark.
> >
> > My setup is:
> >
> > ComputerA (rdkafka_cachesender) > ComputerB (Kafka brokers, ZooKeeper) >
> > ComputerC (Spark)
> >
> > If my explanation is unclear, please reply and let me know.
> >
> > Thanks, guys
