spark-user mailing list archives

From SRK <swethakasire...@gmail.com>
Subject Slower performance while running Spark Kafka Direct Streaming with Kafka 10 cluster
Date Fri, 25 Aug 2017 19:19:07 GMT
Hi,

What would be the appropriate settings to run Spark with Kafka 10? My job
works fine with Spark with Kafka 8 and a Kafka 8 cluster, but it's very
slow with Kafka 10 when using the experimental Kafka Direct APIs for Kafka 10.
I sometimes see the following error. Please see the Kafka parameters and the
consumer strategy for creating the stream below. Any suggestions on how to
run this with better performance would be of great help.

java.lang.AssertionError: assertion failed: Failed to get records for test
stream1 72 324027964 after polling for 120000

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

// kafkaBrokers, topicsSet and ssc are defined elsewhere in the job.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> kafkaBrokers,
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "auto.offset.reset" -> "latest",
  "heartbeat.interval.ms" -> Integer.valueOf(20000),
  "session.timeout.ms" -> Integer.valueOf(60000),
  "request.timeout.ms" -> Integer.valueOf(90000),
  "enable.auto.commit" -> (false: java.lang.Boolean),
  "spark.streaming.kafka.consumer.cache.enabled" -> "false",
  "group.id" -> "test1"
)

val hubbleStream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams)
)
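
One detail that may be relevant to the configuration above: keys starting with "spark.streaming.kafka." are Spark settings that are read from the SparkConf, so placing "spark.streaming.kafka.consumer.cache.enabled" inside kafkaParams most likely has no effect on Spark. The sketch below shows where such settings would normally go. The application name and batch interval are placeholders, not values from the job above, and the cache switch is only honored by Spark versions that support that setting. The poll timeout of 120000 simply mirrors the value reported in the error message and is not a tuning recommendation.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch only: the app name and batch interval are placeholders.
val sparkConf = new SparkConf()
  .setAppName("kafka-direct-example")
  // Spark-side switch for cached Kafka consumers on the executors;
  // set on SparkConf rather than in kafkaParams.
  .set("spark.streaming.kafka.consumer.cache.enabled", "false")
  // How long an executor polls Kafka for records before the
  // "Failed to get records ... after polling for <ms>" assertion fails.
  .set("spark.streaming.kafka.consumer.poll.ms", "120000")

val ssc = new StreamingContext(sparkConf, Seconds(10))
```

The same two settings could also be passed with --conf on spark-submit instead of being set in code.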




