Well, you can use coalesce() to reduce the number of partitions to 1.
(It will take time and is not very efficient, though.)
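To illustrate the trade-off, here is a minimal self-contained sketch (no Spark or Kafka required; the partition `Seq`s and the stubbed producer are purely illustrative stand-ins): with per-partition setup you pay the producer-creation cost once per partition, whereas funneling everything into a single partition first — which is what coalesce(1) does — pays it exactly once.

```scala
// Simulate an RDD's partitions as a Seq of Seqs (illustrative only).
val partitions: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("c"), Seq("d", "e"))

var producersCreated = 0
def newProducer(): String => Unit = {
  producersCreated += 1 // stands in for the cost of new Producer(...)
  msg => ()             // stubbed send: does nothing
}

// foreachPartition style: one producer per partition.
partitions.foreach { part =>
  val send = newProducer()
  part.foreach(send)
}
assert(producersCreated == 3) // one producer per partition

// coalesce(1) style: all rows funneled into a single partition first.
producersCreated = 0
val onePartition: Seq[Seq[String]] = Seq(partitions.flatten)
onePartition.foreach { part =>
  val send = newProducer()
  part.foreach(send)
}
assert(producersCreated == 1) // a single producer for everything
```

The catch, as noted above, is that coalescing to one partition also serializes all the sends through a single task, which is why it is slow for large RDDs.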


On Mon Jan 12 2015 at 7:57:39 PM Hafiz Mujadid [via Apache Spark User List] <[hidden email]> wrote:
Hi experts!

I have a SchemaRDD of messages to be pushed to Kafka, and I am using the following piece of code to do that:

import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

rdd.foreachPartition { itr =>
  val props = new Properties()
  props.put("metadata.broker.list", brokersList)
  props.put("serializer.class", "kafka.serializer.StringEncoder")
  props.put("compression.codec", codec.toString)
  props.put("producer.type", "sync")
  props.put("batch.num.messages", BatchSize.toString)
  props.put("message.send.max.retries", maxRetries.toString)
  props.put("request.required.acks", "-1")
  val producer = new Producer[String, String](new ProducerConfig(props))
  itr.foreach { row =>
    // strip the surrounding brackets from the Row's string form
    val msg = row.toString.drop(1).dropRight(1)
    this.synchronized {
      producer.send(new KeyedMessage[String, String](Topic, msg))
    }
  }
  producer.close()
}

The problem with this code is that it creates a separate Kafka producer for each partition, and I want a single producer shared across all partitions. Is there any way to achieve this?


View this message in context: Re: creating a single kafka producer object for all partitions
Sent from the Apache Spark User List mailing list archive at Nabble.com.