spark-user mailing list archives

From Jacek Laskowski
Subject Re: Spark 2.0
Date Mon, 25 Jul 2016 20:57:35 GMT
Hi Bryan,

Excellent questions about the upcoming 2.0! Took me a while to find
the answer about structured streaming.

? That may be relevant to your question 2.
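
For question 2, here is a minimal sketch of what "count the number of records in a 2 minute window every 2 minutes" computes, written in plain Python rather than Spark so it runs anywhere. The names window_start and count_per_window are made up for illustration, and aligning windows to the Unix epoch is an assumption of this sketch. In Spark 2.0 itself, the window function in org.apache.spark.sql.functions, used inside a groupBy, looks like the closest counterpart, but treat that as a pointer to check against the docs rather than a confirmed answer.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: tumbling (non-overlapping) windows aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
WINDOW = timedelta(minutes=2)


def window_start(ts, width=WINDOW):
    """Return the start of the tumbling window that contains ts."""
    offset = (ts - EPOCH) % width  # time elapsed since the window opened
    return ts - offset


def count_per_window(timestamps, width=WINDOW):
    """Count events per tumbling window, keyed by window start time."""
    counts = Counter(window_start(ts, width) for ts in timestamps)
    return dict(sorted(counts.items()))
```

Each event lands in exactly one window, so the per-window counts add up to the total number of events. A streaming engine would produce the same numbers incrementally as data arrives instead of over a finished list.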

Jacek Laskowski
Mastering Apache Spark
Follow me at

On Mon, Jul 25, 2016 at 8:23 PM, Bryan Jeffrey wrote:
> All,
> I had three questions:
> (1) Is there a timeline for stable Spark 2.0 release?  I know the 'preview'
> build is out there, but was curious what the timeline was for full release.
> Jira seems to indicate that there should be a release 7/27.
> (2)  For 'continuous' datasets there has been a lot of discussion. One item
> that came up in tickets was the idea that 'count()' and other functions do
> not apply to continuous datasets:
>  In this case what is the
> intended procedure to calculate a streaming statistic based on an interval
> (e.g. count the number of records in a 2 minute window every 2 minutes)?
> (3) In previous releases (1.6.1) the call to DStream / RDD repartition w/ a
> number of partitions set to zero silently deletes data.  I have looked in
> Jira for a similar issue, but I do not see one.  I would like to address
> this (and would likely be willing to go fix it myself).  Should I just
> create a ticket?
> Thank you,
> Bryan Jeffrey
