spark-dev mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: Outstanding Spark 2.1.1 issues
Date Tue, 21 Mar 2017 03:18:55 GMT
I've been scrubbing R and think we are tracking 2 issues:


https://issues.apache.org/jira/browse/SPARK-19237


https://issues.apache.org/jira/browse/SPARK-19925



________________________________
From: holden.karau@gmail.com <holden.karau@gmail.com> on behalf of Holden Karau <holden@pigscanfly.ca>
Sent: Monday, March 20, 2017 3:12:35 PM
To: dev@spark.apache.org
Subject: Outstanding Spark 2.1.1 issues

Hi Spark Developers!

As we start working on the Spark 2.1.1 release I've been looking at our outstanding issues
still targeted for it. I've tried to break it down by component so that people in charge of
each component can take a quick look and see if any of these things can/should be re-targeted
to 2.2 or 2.1.2 & the overall list is pretty short (only 9 items - 5 if we only look at
explicitly tagged) :)

If you're working on something for Spark 2.1.1 and it doesn't show up in this list please speak
up now :) We have a lot of issues (including "in progress") that are listed as impacting 2.1.0,
but they aren't targeted for 2.1.1 - if there is something you are working on in there which
should be targeted for 2.1.1 please let us know so it doesn't slip through the cracks.

The query string I used for looking at the 2.1.1 open issues is:

((affectedVersion = 2.1.1 AND cf[12310320] is Empty) OR fixVersion = 2.1.1 OR cf[12310320]
= "2.1.1") AND project = spark AND resolution = Unresolved ORDER BY priority DESC
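For anyone who wants to rerun this query outside the JIRA UI, here is a minimal Python sketch that URL-encodes the JQL for a JIRA REST search request. The helper name build_search_url, the endpoint path, and the fields/maxResults choices are illustrative assumptions and not part of the original message; the sketch only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# JQL copied from the message above; cf[12310320] is the Target Version/s field.
JQL_2_1_1 = (
    '((affectedVersion = 2.1.1 AND cf[12310320] is Empty) '
    'OR fixVersion = 2.1.1 OR cf[12310320] = "2.1.1") '
    'AND project = spark AND resolution = Unresolved '
    'ORDER BY priority DESC'
)

# Assumption: the ASF JIRA serves the standard JIRA REST search endpoint here.
SEARCH_ENDPOINT = "https://issues.apache.org/jira/rest/api/2/search"


def build_search_url(jql, fields=("key", "summary", "priority"), max_results=50):
    """Return a GET URL for the search endpoint (hypothetical helper)."""
    query = urlencode({
        "jql": jql,
        "fields": ",".join(fields),
        "maxResults": max_results,
    })
    return f"{SEARCH_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_search_url(JQL_2_1_1))
```

The resulting URL can be pasted into a browser or fetched with any HTTP client; whether anonymous access and these field names are accepted depends on the JIRA instance.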

None of the open issues appear to be a regression from 2.1.0, but those seem more likely to
show up during the RC process (thanks in advance to everyone testing their workloads :)) &
generally none of them seem to be

(Note: the cf[12310320] clauses refer to the Target Version/s field)

Critical Issues:
 SQL:
  SPARK-19690<https://issues.apache.org/jira/browse/SPARK-19690> - Join a streaming
DataFrame with a batch DataFrame may not work - PR https://github.com/apache/spark/pull/17052
(review in progress by zsxwing, currently failing Jenkins)*

Major Issues:
 SQL:
  SPARK-19035<https://issues.apache.org/jira/browse/SPARK-19035> - rand() function in
case when cause failed - no outstanding PR (consensus on JIRA seems to be leaning towards
it being a real issue but not necessarily everyone agrees just yet - maybe we should slip
this?)*
 Deploy:
  SPARK-19522<https://issues.apache.org/jira/browse/SPARK-19522> - --executor-memory
flag doesn't work in local-cluster mode - https://github.com/apache/spark/pull/16975 (review
in progress by vanzin, but PR currently stalled waiting on response) *
 Core:
  SPARK-20025<https://issues.apache.org/jira/browse/SPARK-20025> - Driver fail over
will not work, if SPARK_LOCAL* env is set. - https://github.com/apache/spark/pull/17357 (waiting
on review) *
 PySpark:
  SPARK-19955<https://issues.apache.org/jira/browse/SPARK-19955> - Update run-tests to
support conda [ Part of Dropping 2.6 support -- which we shouldn't do in a minor release --
but also fixes pip installability tests to run in Jenkins ] - PR failing Jenkins (I need to
poke this some more, but it seems like 2.7 support works but there are some other issues.
Maybe slip to 2.2?)

Minor issues:
 Tests:
  SPARK-19612<https://issues.apache.org/jira/browse/SPARK-19612> - Tests failing with
timeout - No PR per se but it seems unrelated to the 2.1.1 release. It's not targeted for
2.1.1 but listed as affecting 2.1.1 - I'd consider explicitly targeting this for 2.2?
 PySpark:
  SPARK-19570<https://issues.apache.org/jira/browse/SPARK-19570> - Allow to disable
hive in pyspark shell - https://github.com/apache/spark/pull/16906 PR exists but it's difficult
to add automated tests for this (although if SPARK-19955<https://issues.apache.org/jira/browse/SPARK-19955>
gets in it would make testing this easier) - no reviewers yet. Possible re-target?*
 Structured Streaming:
  SPARK-19613<https://issues.apache.org/jira/browse/SPARK-19613> - Flaky test: StateStoreRDDSuite.versioning
and immutability - It's not targeted for 2.1.1 but listed as affecting 2.1.1 - I'd consider
explicitly targeting this for 2.2?
 ML:
  SPARK-19759<https://issues.apache.org/jira/browse/SPARK-19759> - ALSModel.predict
on Dataframes : potential optimization by not using blas - No PR; consider re-targeting unless
someone has a PR waiting in the wings?

Explicitly targeted issues are marked with a *, the remaining issues are listed as impacting
2.1.1 and don't have a specific target version set.

Since 2.1.1 continues the 2.1.0 branch, looking at 2.1.0 shows 1 open blocker in SQL
(SPARK-19983<https://issues.apache.org/jira/browse/SPARK-19983>).

Query string is:

affectedVersion = 2.1.0 AND cf[12310320] is EMPTY AND project = spark AND resolution = Unresolved
AND priority = targetPriority
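The targetPriority placeholder above stands for each priority level in turn. A small Python sketch of that substitution follows; the function name jql_for_priority and the list of priority names are assumptions based on the levels this message reports counts for (Blocker, Major, Minor, Trivial), not something spelled out in the original query.

```python
# Template taken from the query above, with targetPriority as the variable part.
JQL_TEMPLATE = (
    "affectedVersion = 2.1.0 AND cf[12310320] is EMPTY "
    "AND project = spark AND resolution = Unresolved "
    "AND priority = {priority}"
)

# Assumption: the priority levels this message reports counts for.
PRIORITIES = ["Blocker", "Major", "Minor", "Trivial"]


def jql_for_priority(priority):
    """Fill in the targetPriority placeholder (hypothetical helper)."""
    if priority not in PRIORITIES:
        raise ValueError(f"unexpected priority: {priority!r}")
    return JQL_TEMPLATE.format(priority=priority)


if __name__ == "__main__":
    # Print one query per priority level, ready to paste into the JIRA search box.
    for p in PRIORITIES:
        print(jql_for_priority(p))
```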

Continuing on for unresolved 2.1.0 issues in Major there are 163 (76 of them in progress),
65 Minor (26 in progress), and 9 trivial (6 in progress).

I'll be going through the 2.1.0 major issues with open PRs that impact the PySpark component
and seeing if any of them should be targeted for 2.1.1. If anyone from the other components
wants to take a look through, we might find some easy wins to be merged.

Cheers,

Holden :)

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
