spark-dev mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Outstanding Spark 2.1.1 issues
Date Mon, 20 Mar 2017 22:12:35 GMT
Hi Spark Developers!

As we start working on the Spark 2.1.1 release I've been looking at our
outstanding issues still targeted for it. I've tried to break it down by
component so that people in charge of each component can take a quick look
and see if any of these things can/should be re-targeted to 2.2 or 2.1.2 &
the overall list is pretty short (only 9 items - 5 if we only look at
explicitly tagged) :)

If you're working on something for Spark 2.1.1 and it doesn't show up in this
list please speak up now :) We have a lot of issues (including "in
progress") that are listed as impacting 2.1.0, but they aren't targeted for
2.1.1 - if there is something you are working on in there which should be
targeted for 2.1.1 please let us know so it doesn't slip through the cracks.

The query string I used for looking at the 2.1.1 open issues is:

((affectedVersion = 2.1.1 AND cf[12310320] is Empty) OR fixVersion = 2.1.1
OR cf[12310320] = "2.1.1") AND project = spark AND resolution = Unresolved
ORDER BY priority DESC

None of the open issues appear to be a regression from 2.1.0, but those
seem more likely to show up during the RC process (thanks in advance to
everyone testing their workloads :)) & generally none of them seem to be

(Note: the cfs are for Target Version/s field)
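
For anyone who wants to re-run the query for another release, here is a minimal sketch that rebuilds the same JQL string for an arbitrary version and URL-encodes it for a search request. The function names, the `project` default, and the `JIRA_SEARCH_URL` value (JIRA's standard REST search path on the ASF instance) are my assumptions, not something taken from this message, and nothing here actually sends a request.

```python
from urllib.parse import urlencode

# Custom field id for "Target Version/s", as noted above.
TARGET_VERSION_FIELD = "cf[12310320]"
# Assumption: the standard JIRA REST search endpoint on the ASF JIRA.
JIRA_SEARCH_URL = "https://issues.apache.org/jira/rest/api/2/search"


def open_issues_jql(version: str, project: str = "spark") -> str:
    """Rebuild the open-issues JQL from the message for a given version."""
    f = TARGET_VERSION_FIELD
    return (
        f"((affectedVersion = {version} AND {f} is Empty) "
        f"OR fixVersion = {version} "
        f'OR {f} = "{version}") '
        f"AND project = {project} AND resolution = Unresolved "
        "ORDER BY priority DESC"
    )


def search_url(jql: str, max_results: int = 50) -> str:
    """URL-encode the query for an HTTP GET; no request is made here."""
    return JIRA_SEARCH_URL + "?" + urlencode({"jql": jql, "maxResults": max_results})


jql = open_issues_jql("2.1.1")
print(jql)
print(search_url(jql))
```

Calling `open_issues_jql("2.1.2")` would give the equivalent query for the next maintenance release.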

Critical Issues:
 SQL:
  SPARK-19690 <https://issues.apache.org/jira/browse/SPARK-19690> - Join a
streaming DataFrame with a batch DataFrame may not work - PR
https://github.com/apache/spark/pull/17052 (review in progress by zsxwing,
currently failing Jenkins)*

Major Issues:
 SQL:
  SPARK-19035 <https://issues.apache.org/jira/browse/SPARK-19035> - rand()
function in case when cause failed - no outstanding PR (consensus on JIRA
seems to be leaning towards it being a real issue but not necessarily
everyone agrees just yet - maybe we should slip this?)*
 Deploy:
  SPARK-19522 <https://issues.apache.org/jira/browse/SPARK-19522>
 - --executor-memory flag doesn't work in local-cluster mode -
https://github.com/apache/spark/pull/16975 (review in progress by vanzin,
but PR currently stalled waiting on response) *
 Core:
  SPARK-20025 <https://issues.apache.org/jira/browse/SPARK-20025> - Driver
fail over will not work, if SPARK_LOCAL* env is set. -
https://github.com/apache/spark/pull/17357 (waiting on review) *
 PySpark:
  SPARK-19955 <https://issues.apache.org/jira/browse/SPARK-19955> - Update
run-tests to support conda [ Part of Dropping 2.6 support -- which we
shouldn't do in a minor release -- but also fixes pip installability tests
to run in Jenkins ] - PR failing Jenkins (I need to poke this some more,
but it seems like 2.7 support works and there are some other issues. Maybe
slip to 2.2?)

Minor issues:
 Tests:
  SPARK-19612 <https://issues.apache.org/jira/browse/SPARK-19612> - Tests
failing with timeout - No PR per se but it seems unrelated to the 2.1.1
release. It's not targeted for 2.1.1 but listed as affecting 2.1.1 - I'd
consider explicitly targeting this for 2.2?
 PySpark:
  SPARK-19570 <https://issues.apache.org/jira/browse/SPARK-19570> - Allow
to disable hive in pyspark shell -
https://github.com/apache/spark/pull/16906 PR exists but it's difficult to
add automated tests for this (although if SPARK-19955
<https://issues.apache.org/jira/browse/SPARK-19955> gets in it would make
testing this easier) - no reviewers yet. Possible re-target?*
 Structured Streaming:
  SPARK-19613 <https://issues.apache.org/jira/browse/SPARK-19613> - Flaky
test: StateStoreRDDSuite.versioning and immutability - It's not targeted
for 2.1.1 but listed as affecting 2.1.1 - I'd consider explicitly targeting
this for 2.2?
 ML:
  SPARK-19759 <https://issues.apache.org/jira/browse/SPARK-19759>
 - ALSModel.predict on Dataframes : potential optimization by not using
blas - No PR; consider re-targeting unless someone has a PR waiting in the
wings?

Explicitly targeted issues are marked with a *; the remaining issues are
listed as impacting 2.1.1 and don't have a specific target version set.

Since 2.1.1 continues the 2.1.0 branch, looking at 2.1.0 shows 1 open
blocker in SQL (SPARK-19983
<https://issues.apache.org/jira/browse/SPARK-19983>).

The query string is:

affectedVersion = 2.1.0 AND cf[12310320] is EMPTY AND project = spark AND
resolution = Unresolved AND priority = targetPriority

Continuing on through the unresolved 2.1.0 issues, there are 163 Major (76
of them in progress), 65 Minor (26 in progress), and 9 Trivial (6 in
progress).
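
The `targetPriority` in the query string above is a placeholder that gets swapped for each priority level in turn. A rough sketch of that substitution, assuming the five priority names used in this message (the helper name and the `PRIORITIES` list are mine, for illustration only):

```python
# Priority levels as they are named in this message.
PRIORITIES = ["Blocker", "Critical", "Major", "Minor", "Trivial"]


def untargeted_jql(version: str, priority: str, project: str = "spark") -> str:
    """Fill in the version and priority of the untargeted-issues query."""
    return (
        f"affectedVersion = {version} AND cf[12310320] is EMPTY "
        f"AND project = {project} AND resolution = Unresolved "
        f"AND priority = {priority}"
    )


# One query per priority level for the 2.1.0 branch.
queries = {p: untargeted_jql("2.1.0", p) for p in PRIORITIES}
for priority, jql in queries.items():
    print(f"{priority}: {jql}")
```

Each resulting string can be pasted into the JIRA issue search as-is.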

I'll be going through the 2.1.0 major issues with open PRs that impact the
PySpark component and seeing if any of them should be targeted for 2.1.1.
If anyone from the other components wants to take a look through, we might
find some easy wins to be merged.

Cheers,

Holden :)

-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
