spark-dev mailing list archives

From Gabor Somogyi <gsomo...@cloudera.com.INVALID>
Subject Re: [SS] KafkaSource doesn't use KafkaSourceInitialOffsetWriter for initial offsets?
Date Mon, 26 Aug 2019 16:58:12 GMT
OK, starting with this tomorrow...

On Mon, 26 Aug 2019, 16:05 Jungtaek Lim, <kabhwan@gmail.com> wrote:

> Thanks! The patch is here: https://github.com/apache/spark/pull/25583
>
> On Mon, Aug 26, 2019 at 11:02 PM Gabor Somogyi <gabor.g.somogyi@gmail.com>
> wrote:
>
>> Just checked this and it's a copy-paste :) It works properly when
>> KafkaSourceInitialOffsetWriter is used. Pull me in if a review is needed.
>>
>> BR,
>> G
>>
>>
>> On Mon, Aug 26, 2019 at 3:57 PM Jungtaek Lim <kabhwan@gmail.com> wrote:
>>
>>> Nice finding! I don't see any reason not to use
>>> KafkaSourceInitialOffsetWriter from KafkaSource, as they're identical. I
>>> guess it was copied and pasted at some point and never addressed.
>>> As you haven't submitted a patch, I'll submit one shortly and credit
>>> you. I'd close mine and wait for your patch if you plan to do it.
>>> Please let me know.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> On Mon, Aug 26, 2019 at 8:03 PM Jacek Laskowski <jacek@japila.pl> wrote:
>>>
>>>> Hi,
>>>>
>>>> Just found out that KafkaSource [1] does not
>>>> use KafkaSourceInitialOffsetWriter (of KafkaMicroBatchStream) [2] for
>>>> initial offsets.
>>>>
>>>> Any reason for that? Should I report an issue? Just checking, as I work
>>>> with 2.4.3 exclusively and have no idea what's coming in 3.0.
>>>>
>>>> [1]
>>>> https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala#L102
>>>>
>>>> [2]
>>>> https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala#L281
>>>>
>>>> Regards,
>>>> Jacek Laskowski
>>>> ----
>>>> https://about.me/JacekLaskowski
>>>> The Internals of Spark SQL https://bit.ly/spark-sql-internals
>>>> The Internals of Spark Structured Streaming
>>>> https://bit.ly/spark-structured-streaming
>>>> The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>
>>>>
>>>
>>> --
>>> Name : Jungtaek Lim
>>> Blog : http://medium.com/@heartsavior
>>> Twitter : http://twitter.com/heartsavior
>>> LinkedIn : http://www.linkedin.com/in/heartsavior
>>>
>>
>
>
