spark-user mailing list archives

From Saisai Shao <>
Subject Re: Kafka 0.10 with PySpark
Date Wed, 05 Jul 2017 07:26:58 GMT
Please see the reason in this thread ( ). It would be better to use
Structured Streaming instead.

> So I would like to -1 this patch. I think it's been a mistake to support
> dstream in Python -- yes it satisfies a checkbox and Spark could claim
> there's support for streaming in Python. However, the tooling and maturity
> for working with streaming data (both in Spark and the broader
> ecosystem) is simply not there. It is a big baggage to maintain, and
> creates the wrong impression that production streaming jobs can be
> written in Python.
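
For reference, a minimal sketch of what reading from Kafka over SSL might look like with Structured Streaming in PySpark. The broker address, topic name, truststore path and password are placeholders, and the helper names `kafka_ssl_options` and `read_kafka_stream` are made up for illustration. It assumes the Kafka source for Structured Streaming (the spark-sql-kafka-0-10 package) is on the classpath and relies on options prefixed with `kafka.` being passed through to the Kafka consumer.

```python
def kafka_ssl_options(bootstrap_servers, topic, truststore_path, truststore_password):
    """Build reader options for the Structured Streaming Kafka source over SSL.

    Keys prefixed with "kafka." are handed to the underlying Kafka consumer.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "kafka.security.protocol": "SSL",
        "kafka.ssl.truststore.location": truststore_path,
        "kafka.ssl.truststore.password": truststore_password,
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame; `spark` is an existing SparkSession."""
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Placeholder values only; replace with your own brokers, topic and truststore.
opts = kafka_ssl_options(
    "broker1:9093", "events", "/path/to/truststore.jks", "changeit"
)
```

With a running session this would be used as `df = read_kafka_stream(spark, opts)`, after which `df.selectExpr("CAST(value AS STRING)")` gives the message payloads as strings.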

On Tue, Jul 4, 2017 at 10:53 PM, Daniel van der Ende <> wrote:

> Hi,
> I'm working on integrating some PySpark code with Kafka. We'd like to use
> SSL/TLS, and so want to use Kafka 0.10. Because Structured Streaming is
> still marked alpha, we'd like to use Spark Streaming. On this page,
> however, it indicates that the Kafka 0.10 integration in Spark does not
> support Python (
> integration.html). I've been trying to figure out why, but have not been
> able to find anything. Is there any particular reason for this?
> Thanks,
> Daniel
