Thanks for sharing the blockers, Wenchen. SPARK-31404 has sub-tasks; does that mean all of its sub-tasks are blockers for this release?

Xiao, I sincerely respect the practices the Spark community has established, so please treat this as my two cents. I would just like to see how the community can focus on such a huge release: counting only bugs, improvements, and new features, nearly 2,000 issues have been resolved in Spark 3.0.0 alone. That volume is quite different from the usual bugfix and minor releases, which suggests special care is needed.


On Fri, Apr 10, 2020 at 1:22 PM Wenchen Fan <cloud0fan@gmail.com> wrote:
The ongoing critical issues I'm aware of are:
SPARK-31257: Fix ambiguous two different CREATE TABLE syntaxes
SPARK-31404: backward compatibility issues after switching to Proleptic Gregorian calendar
SPARK-31399: closure cleaner is broken in Spark 3.0
SPARK-28067: Incorrect results in decimal aggregation with whole-stage codegen enabled

That said, I'm -1 (binding) on RC1.

Please reply to this thread if you know of more critical issues that should be fixed before 3.0.

Thanks,
Wenchen


On Fri, Apr 10, 2020 at 10:01 AM Xiao Li <lixiao@databricks.com> wrote:
Only low-risk or high-value bug fixes and documentation changes are allowed to be merged to branch-3.0. I expect all committers to follow the same rules as in previous releases.

Xiao

On Thu, Apr 9, 2020 at 6:13 PM Jungtaek Lim <kabhwan.opensource@gmail.com> wrote:
It looks like around 80 commits have landed on branch-3.0 since we cut RC1 (I know many of them version configs or add docs). Shall we announce a blocker-only phase and maintain a list of blockers to restrict changes on the branch? Otherwise everyone will hesitate to test RC1 (see how few people have tested RC1 in this thread), since they would probably need to repeat the same tests on RC2.

On Thu, Apr 9, 2020 at 5:50 PM Jungtaek Lim <kabhwan.opensource@gmail.com> wrote:
I went through some manual tests of the new Structured Streaming features in Spark 3.0.0. (Please let me know if there are more features we'd like to test manually.)

* file source cleanup - both "archive" and "delete" modes work, and the query fails as expected when the input directory is the output directory of a file sink.
* Kafka source/sink - headers work for both source and sink, "group id prefix" and "static group id" work, and I confirmed that starting offsets by timestamp work for the streaming case.
* event logs with streaming queries - enabled rolling event logs and confirmed that compaction works, that the SHS can read compacted event logs, and that downloading an event log in the SHS zips the event log directory. The original functionality with a single event log file works as well. (A sketch of the options exercised follows below.)
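
For anyone who wants to reproduce these checks, here is a minimal sketch of the options exercised; the paths, topic name, and timestamp below are placeholders, not the values I actually used:

    // File source cleanup of completed files: "archive" or "delete".
    val fileStream = spark.readStream
      .format("text")
      .option("cleanSource", "archive")             // or "delete"
      .option("sourceArchiveDir", "/tmp/archived")  // required when archiving
      .load("/tmp/input")

    // Kafka source: headers, group id prefix / static group id,
    // and starting offsets by timestamp.
    val kafkaStream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host:9092")
      .option("subscribe", "topic1")
      .option("includeHeaders", "true")
      .option("groupIdPrefix", "my-prefix")  // or static: .option("kafka.group.id", "my-group")
      .option("startingOffsetsByTimestamp", """{"topic1": {"0": 1586000000000}}""")
      .load()

    // Rolling event logs with compaction are enabled via configuration,
    // e.g. in spark-defaults.conf:
    //   spark.eventLog.rolling.enabled=true
    //   spark.eventLog.rolling.maxFileSize=128m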

Looks good, though there are still plenty of commits being pushed to branch-3.0 after RC1, which makes me feel it may not be safe to carry the RC1 test results over to RC2.

On Sat, Apr 4, 2020 at 12:49 AM Sean Owen <srowen@apache.org> wrote:
Aside from the other issues mentioned here, which probably do require
another RC, this looks pretty good to me.

I built on Ubuntu 19 and ran with Java 11, -Pspark-ganglia-lgpl
-Pkinesis-asl -Phadoop-3.2 -Phive-2.3 -Pyarn -Pmesos -Pkubernetes
-Phive-thriftserver -Djava.version=11
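
That is, roughly the following invocation, assuming the standard
build/mvn wrapper (the exact goals here are from memory and may differ):

    ./build/mvn -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-3.2 -Phive-2.3 \
      -Pyarn -Pmesos -Pkubernetes -Phive-thriftserver -Djava.version=11 \
      clean install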

I did see the following test failures, but as usual, I'm not sure
whether they're specific to my environment. Anyone else see these,
particularly the R warnings?


PythonUDFSuite:
org.apache.spark.sql.execution.python.PythonUDFSuite *** ABORTED ***
  java.lang.RuntimeException: Unable to load a Suite class that was
discovered in the runpath:
org.apache.spark.sql.execution.python.PythonUDFSuite
  at org.scalatest.tools.DiscoverySuite$.getSuiteInstance(DiscoverySuite.scala:81)
  at org.scalatest.tools.DiscoverySuite.$anonfun$nestedSuites$1(DiscoverySuite.scala:38)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
  at scala.collection.Iterator.foreach(Iterator.scala:941)
  at scala.collection.Iterator.foreach$(Iterator.scala:941)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
  at scala.collection.IterableLike.foreach(IterableLike.scala:74)
  at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
  at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
  at scala.collection.TraversableLike.map(TraversableLike.scala:238)


- SPARK-25158: Executor accidentally exit because
ScriptTransformationWriterThread throw Exception *** FAILED ***
  Expected exception org.apache.spark.SparkException to be thrown, but
no exception was thrown (SQLQuerySuite.scala:2384)


* checking for missing documentation entries ... WARNING
Undocumented code objects:
  ‘%<=>%’ ‘add_months’ ‘agg’ ‘approxCountDistinct’ ‘approxQuantile’
  ‘approx_count_distinct’ ‘arrange’ ‘array_contains’ ‘array_distinct’
...
 WARNING
‘qpdf’ is needed for checks on size reduction of PDFs

On Tue, Mar 31, 2020 at 10:04 PM Reynold Xin <rxin@databricks.com> wrote:
>
> Please vote on releasing the following candidate as Apache Spark version 3.0.0.
>
> The vote is open until 11:59pm Pacific time Fri Apr 3, and passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.0.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.0.0-rc1 (commit 6550d0d5283efdbbd838f3aeaf0476c7f52a0fb1):
> https://github.com/apache/spark/tree/v3.0.0-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1341/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc1-docs/
>
> The list of bug fixes going into 3.0.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12339177
>
> This release is using the release script of the tag v3.0.0-rc1.
>
>
> FAQ
>
> =========================
> How can I help test this release?
> =========================
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC to see if anything important breaks; on the Java/Scala
> side you can add the staging repository to your project's resolvers and
> test with the RC (make sure to clean up the artifact cache before/after
> so you don't end up building with an out-of-date RC going forward).
>
> ===========================================
> What should happen to JIRA tickets still targeting 3.0.0?
> ===========================================
> The current list of open tickets targeted at 3.0.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" = 3.0.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==================
> But my bug isn't fixed?
> ==================
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
>
> Note: I fully expect this RC to fail.
>
>
>
