spark-issues mailing list archives

From "Mark Grover (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API
Date Tue, 19 Jan 2016 19:32:40 GMT

    [ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107267#comment-15107267 ]

Mark Grover commented on SPARK-12177:
-------------------------------------

Thanks Mario! 
bq. We should also have a python/pyspark/streaming/kafka-v09.py that matches our external/kafka-v09
I agree, I will look into this.
bq. Why do you have the Broker.scala class? Unless I am missing something, it should be knocked off
Yeah, I noticed that too and I agree. This should be pretty simple to take out. I also [noticed|https://issues.apache.org/jira/browse/SPARK-12177?focusedCommentId=15089750&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15089750] that the v09 example is picking up some Kafka v0.8 jars, so I am working on fixing that too.
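
For concreteness, the kind of build-side exclusion such a fix usually involves might look like the sbt-style fragment below. The module and artifact coordinates here are illustrative assumptions, not the actual change in the PR.

{code:scala}
// Illustrative sbt fragment (assumed coordinates, not the PR's actual build change):
// depend on the v09 streaming module while excluding the 0.8-era Kafka artifact,
// so the new example never picks up the old broker/client jar transitively.
libraryDependencies += ("org.apache.spark" %% "spark-streaming-kafka-v09" % "1.6.0")
  .exclude("org.apache.kafka", "kafka_2.10")
{code}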
bq. I think the package should be 'org.apache.spark.streaming.kafka' only in external/kafka-v09 and not 'org.apache.spark.streaming.kafka.v09'. This is because we produce a jar with a different name (the user picks which one, and even if he/she mismatches, it errors correctly since the KafkaUtils method signatures are different)
I totally understand what you mean. However, Kafka has its [own assembly in Spark|https://github.com/apache/spark/tree/master/external/kafka-assembly] and, the way the code is structured right now, both the new API and the old API would go into the same assembly, so it's important to have different package names. Also, for end users transitioning from the old API to the new one, I foresee them keeping two versions of their Spark-Kafka app: one that works with the old API and one with the new. The transition would be easier if they could include both Kafka API versions on the Spark classpath and pick which app to run, without mucking with Maven dependencies and recompiling whenever they want to switch. Let me know if you disagree.

> Update KafkaDStreams to new Kafka 0.9 Consumer API
> --------------------------------------------------
>
>                 Key: SPARK-12177
>                 URL: https://issues.apache.org/jira/browse/SPARK-12177
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.6.0
>            Reporter: Nikita Tarasenko
>              Labels: consumer, kafka
>
> Kafka 0.9 has already been released, and it introduces a new consumer API that is not compatible with the old one. So, I added the new consumer API as separate classes, in the package org.apache.spark.streaming.kafka.v09, with the changed API. I did not remove the old classes, for backward compatibility: users will not need to change their old Spark applications when they upgrade to a new Spark version.
> Please review my changes




