kafka-users mailing list archives

From 刘明敏 <diveintotomor...@gmail.com>
Subject encoding issue in kafka
Date Sat, 02 Jun 2012 10:07:37 GMT
We encountered an encoding issue when dealing with Chinese characters.

The producer sends the characters in the right encoding (UTF-8), but by the
time the consumer receives them they have all turned into question marks: ????
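
For reference, here is a minimal sketch of how this symptom usually arises on the JVM in general. It is not taken from Kafka's code: the names QuestionMarkDemo and roundTrip are hypothetical, and ISO-8859-1 merely stands in for any charset that cannot represent Chinese. When a String is turned into bytes with such a charset, each unmappable character is replaced by the byte for '?', and no later decoding step can recover it.

```scala
import java.nio.charset.StandardCharsets

object QuestionMarkDemo {
  // Encode the text with the named charset, then decode the bytes as UTF-8.
  // This mimics a producer and consumer that may disagree on the charset.
  def roundTrip(text: String, encodeCharset: String): String = {
    val bytes = text.getBytes(encodeCharset)
    new String(bytes, StandardCharsets.UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val text = "\u4e2d\u6587" // two Chinese characters, written as escapes

    // Same charset on both sides: the text survives.
    println(roundTrip(text, "UTF-8") == text)

    // ISO-8859-1 cannot represent these characters, so getBytes
    // substitutes '?' for each of them before anything is sent.
    println(roundTrip(text, "ISO-8859-1"))
  }
}
```

If the question marks are already present in the bytes that leave the producer, the loss happened at encoding time rather than in the broker or the consumer.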

When starting the producer, the Kafka broker, and the consumer, we tried
specifying -Dfile.encoding=UTF-8, but it made no difference.
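
One way to confirm whether the -Dfile.encoding flag actually reached a given JVM is to print the default charset from inside that process. The sketch below uses only standard JDK calls. The object name CharsetCheck and the helper survives are hypothetical, and it assumes, without having checked the Kafka source, that some step in the pipeline converts strings to bytes using the platform default charset.

```scala
import java.nio.charset.Charset

object CharsetCheck {
  // True when the text is unchanged after being encoded and decoded
  // with the given charset.
  def survives(text: String, cs: Charset): Boolean =
    new String(text.getBytes(cs), cs) == text

  def main(args: Array[String]): Unit = {
    val default = Charset.defaultCharset()
    println("file.encoding   = " + System.getProperty("file.encoding"))
    println("default charset = " + default.name())
    // "\u4e2d\u6587" is two Chinese characters, written as escapes.
    println("Chinese survives default charset: " + survives("\u4e2d\u6587", default))
  }
}
```

Running this in the producer, broker, and consumer processes separately would show which of them, if any, is not using UTF-8 as its default.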

In the producer we use StringEncoder. Below is a snippet of the producer:

  val props = new Properties()

  props.put("serializer.class", "kafka.serializer.StringEncoder")
  props.put("compression.codec", "1") // gzip

  val producerConfig = new ProducerConfig(props)
  val producer = new Producer[String, String](producerConfig)
  val data = new ProducerData[String, String](topic, partitionKey,


And the consumer:

    val topicMessageStreams =
      consumerConnector.createMessageStreams(Predef.Map(topic -> consumers), new StringDecoder)

    // Start one processing thread per stream.
    for ((topic, streamList) <- topicMessageStreams) {
      for (stream <- streamList) {
        val processor = new StreamProcessor(stream)
        new Thread(processor).start()
      }
    }

The StreamProcessor just iterates over each stream:
  val message = iterator.next.message // Chinese characters in message turn into ?????

Can anyone help?

Best Regards

刘明敏 | mmLiu
