+1 (non-binding) - checked Python artifacts with virtual env.

On Sun, Dec 18, 2016 at 11:42 AM Denny Lee <denny.g.lee@gmail.com> wrote:
+1 (non-binding)

On Sat, Dec 17, 2016 at 11:45 PM Liwei Lin <lwlin7@gmail.com> wrote:


On Sat, Dec 17, 2016 at 10:29 AM, Yuming Wang <wgyumg@gmail.com> wrote:
I hope https://github.com/apache/spark/pull/16252 can be merged before the 2.1.0 release. It fixes a case where a broadcast cannot fit in memory.

On Sat, Dec 17, 2016 at 10:23 AM, Joseph Bradley <joseph@databricks.com> wrote:

On Fri, Dec 16, 2016 at 3:21 PM, Herman van Hövell tot Westerflier <hvanhovell@databricks.com> wrote:

On Sat, Dec 17, 2016 at 12:14 AM, Xiao Li <gatorsmile@gmail.com> wrote:

Xiao Li

2016-12-16 12:19 GMT-08:00 Felix Cheung <felixcheung_m@hotmail.com>:

For R we have a license field in the DESCRIPTION, and this is standard practice (and requirement) for R packages.

From: Sean Owen <sowen@cloudera.com>
Sent: Friday, December 16, 2016 9:57:15 AM
To: Reynold Xin; dev@spark.apache.org
Subject: Re: [VOTE] Apache Spark 2.1.0 (RC5)


(If you have a template for these emails, maybe update it to use https links. They work for apache.org domains. After all we are asking people to verify the integrity of release artifacts, so it might as well be secure.)
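
For anyone checking artifact integrity by hand, here is a minimal sketch of a digest comparison in Python. The function names are hypothetical, and the default of SHA-512 is an assumption rather than something stated in this thread; pass whichever algorithm the published digest actually uses. It only compares a bare hex digest and does not verify the GPG signature.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def normalize(published):
    """Drop all whitespace and lower-case a published hex digest."""
    return "".join(published.split()).lower()


def digest_matches(path, published, algorithm="sha512"):
    """True if the file's digest equals the published hex digest."""
    return file_digest(path, algorithm) == normalize(published)
```

Calling `digest_matches("some-artifact.tgz", published_hex)` with a placeholder file name and the hex string copied from the digest file returns `True` only when the two agree.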

(Also the new archives use .tar.gz instead of .tgz like the others. No big deal, my OCD eye just noticed it.)

I don't see an Apache license / notice for the Pyspark or SparkR artifacts. It would be good practice to include this in a convenience binary. I'm not sure if it's strictly mandatory, but something to adjust in any event. I think that's all there is to do for SparkR. For Pyspark, which packages a bunch of dependencies, it does include the licenses (good) but I think it should include the NOTICE file.

This is the first time I recall getting 0 test failures off the bat!

I'm using Java 8 / Ubuntu 16 and yarn/hive/hadoop-2.7 profiles.

I think I'd +1 this therefore unless someone knows that the license issue above is real and a blocker.

On Fri, Dec 16, 2016 at 5:17 AM Reynold Xin <rxin@databricks.com> wrote:

Please vote on releasing the following candidate as Apache Spark version 2.1.0. The vote is open until Sun, December 18, 2016 at 21:30 PT and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.1.0

[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see


The tag to be voted on is v2.1.0-rc5 (cd0a08361e2526519e7c131c42116bf56fa62c76)

The release files, including signatures, digests, etc. can be found at:

Release artifacts are signed with the following key:

The staging repository for this release can be found at:

The documentation corresponding to this release can be found at:


How can I help test this release?

If you are a Spark user, you can help us test this release by taking an existing Spark workload and running on this release candidate, then reporting any regressions.

What should happen to JIRA tickets still targeting 2.1.0?

Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else please retarget to 2.1.1 or 2.2.0.

What happened to RC3/RC4?

They had issues with the release packaging and as a result were skipped.


Herman van Hövell

Software Engineer

Databricks Inc.


+31 6 420 590 27




Joseph Bradley

Software Engineer - Machine Learning

Databricks, Inc.