spark-user mailing list archives

From Gabor Somogyi <>
Subject Re: Spark Structured Streaming Custom Sources confusion
Date Fri, 28 Jun 2019 11:06:09 GMT
Hi Lars,

Since Structured Streaming doesn't support receivers at all, that
source/sink can't be used.

Data Source V2 is under development and is therefore a moving
target, so I suggest implementing it with v1 (unless special features are
required from v2).
Additionally, since I've just adapted the Kafka batch source/sink, I can say
it's doable to migrate from v1 to v2 when the time comes.
(Please see; worth mentioning this is batch and not streaming, but there is
a similar PR.)
Dropping v1 will not happen lightning fast in the near future though...
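For readers landing here later, a rough sketch of what "implement it with v1" means for a streaming source in Spark 2.x. Everything prefixed `My` is an illustrative placeholder, not anything from this thread; the traits themselves (`StreamSourceProvider`, `Source`, `Offset`) are real, but `Source` and `Offset` live in Spark's internal `org.apache.spark.sql.execution.streaming` package, which fits the observation in the quoted question that this API was never properly documented.

```scala
// Hypothetical minimal v1 streaming source (Spark 2.x). All "My*" names
// are placeholders; the data here is a synthetic counter standing in for
// a real external system.
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.{LongOffset, Offset, Source}
import org.apache.spark.sql.sources.{DataSourceRegister, StreamSourceProvider}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

class MySourceProvider extends StreamSourceProvider with DataSourceRegister {
  // Short alias usable as spark.readStream.format("my-source").
  override def shortName(): String = "my-source"

  // Called before the source exists, to resolve the schema up front.
  override def sourceSchema(
      sqlContext: SQLContext,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): (String, StructType) =
    (shortName(), MySource.schema)

  override def createSource(
      sqlContext: SQLContext,
      metadataPath: String,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): Source =
    new MySource(sqlContext)
}

object MySource {
  val schema: StructType = StructType(StructField("value", LongType) :: Nil)
}

class MySource(sqlContext: SQLContext) extends Source {
  private var highestOffset = 0L

  override def schema: StructType = MySource.schema

  // Latest offset available, or None if no data has arrived yet.
  override def getOffset: Option[Offset] = {
    highestOffset += 1 // stand-in for polling the external system
    Some(LongOffset(highestOffset))
  }

  // Return the data in the half-open range (start, end] as a DataFrame.
  override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
    // LongOffset.convert also handles the SerializedOffset form that
    // appears after recovery from a checkpoint.
    val from = start.flatMap(LongOffset.convert).map(_.offset).getOrElse(0L)
    val to = LongOffset.convert(end).map(_.offset).getOrElse(0L)
    import sqlContext.implicits._
    sqlContext.sparkContext.parallelize((from + 1) to to).toDF("value")
  }

  override def stop(): Unit = () // release external connections here
}
```

With a `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` entry on the classpath, this would be reachable as `spark.readStream.format("my-source").load()`. The `getOffset`/`getBatch`/`stop` trio is what the micro-batch engine drives, which is also roughly the shape one would re-express when migrating to v2 later.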


On Tue, Jun 25, 2019 at 10:02 PM Lars Francke <> wrote:

> Hi,
> I'm a bit confused about the current state and the future plans of custom
> data sources in Structured Streaming.
> So for DStreams we could write a Receiver as documented. Can this be used
> with Structured Streaming?
> Then we had the DataSource API with DefaultSource et al., which was (in my
> opinion) never properly documented.
> With Spark 2.3 we got a new DataSourceV2 (which was also a marker
> interface), also not properly documented.
> Now with Spark 3 this seems to change again? (
>, at least the
> DataSourceV2 interface is gone; still no documentation, but it's still
> called v2 somehow?
> Can anyone shed some light on the current state of data sources & sinks
> for batch & streaming in Spark 2.4 and 3.x?
> Thank you!
> Cheers,
> Lars
