spark-dev mailing list archives

From Josh Rosen <rosenvi...@gmail.com>
Subject Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
Date Sat, 13 Dec 2014 04:00:38 GMT
+1.  Tested using spark-perf and the Spark EC2 scripts.  I didn't notice any performance
regressions that could not be attributed to changes in the default configuration.  To be more
specific, when running Spark 1.2.0 with the Spark 1.1.0 settings of spark.shuffle.manager=hash
and spark.shuffle.blockTransferService=nio, there was no performance regression and, in fact,
there were significant performance improvements for some workloads.
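For anyone who wants to reproduce this comparison, the 1.1.0-era settings named above can be applied in conf/spark-defaults.conf; this is just a sketch using the property names mentioned in this thread:

```properties
# Restore the Spark 1.1.0 shuffle defaults when running 1.2.0
spark.shuffle.manager                hash
spark.shuffle.blockTransferService   nio
```

The same settings can also be passed per job, e.g. `spark-submit --conf spark.shuffle.manager=hash --conf spark.shuffle.blockTransferService=nio ...`.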

In Spark 1.2.0, the new default settings are spark.shuffle.manager=sort and spark.shuffle.blockTransferService=netty.
With these new settings, I noticed a performance regression in the scala-sort-by-key-int
spark-perf test.  However, Spark 1.1.0 and 1.1.1 exhibit a similar regression on that same
test when run with spark.shuffle.manager=sort, so this slowdown is explained by the change
of defaults rather than by new code in 1.2.0.  Besides this, most of the other tests ran at
the same speed or faster with the new 1.2.0 defaults.  Also, keep in mind that this is a
somewhat artificial microbenchmark; I have heard anecdotal reports from many users that their
real workloads have run faster with 1.2.0.

Based on these results, I’m comfortable giving a +1 on 1.2.0 RC2.

- Josh

On December 11, 2014 at 9:52:39 AM, Sandy Ryza (sandy.ryza@cloudera.com) wrote:

+1 (non-binding). Tested on Ubuntu against YARN.  

On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin <rxin@databricks.com> wrote:  

> +1  
>  
> Tested on OS X.  
>  
> On Wednesday, December 10, 2014, Patrick Wendell <pwendell@gmail.com>  
> wrote:  
>  
> > Please vote on releasing the following candidate as Apache Spark version  
> > 1.2.0!  
> >  
> > The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):  
> >  
> >  
> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e
> >  
> > The release files, including signatures, digests, etc. can be found at:  
> > http://people.apache.org/~pwendell/spark-1.2.0-rc2/  
> >  
> > Release artifacts are signed with the following key:  
> > https://people.apache.org/keys/committer/pwendell.asc  
> >  
> > The staging repository for this release can be found at:  
> > https://repository.apache.org/content/repositories/orgapachespark-1055/  
> >  
> > The documentation corresponding to this release can be found at:  
> > http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/  
> >  
> > Please vote on releasing this package as Apache Spark 1.2.0!  
> >  
> > The vote is open until Saturday, December 13, at 21:00 UTC and passes  
> > if a majority of at least 3 +1 PMC votes are cast.  
> >  
> > [ ] +1 Release this package as Apache Spark 1.2.0  
> > [ ] -1 Do not release this package because ...  
> >  
> > To learn more about Apache Spark, please see  
> > http://spark.apache.org/  
> >  
> > == What justifies a -1 vote for this release? ==  
> > This vote is happening relatively late into the QA period, so  
> > -1 votes should only occur for significant regressions from  
> > 1.0.2. Bugs already present in 1.1.X, minor  
> > regressions, or bugs related to new features will not block this  
> > release.  
> >  
> > == What default changes should I be aware of? ==  
> > 1. The default value of "spark.shuffle.blockTransferService" has been  
> > changed to "netty"  
> > --> Old behavior can be restored by switching to "nio"  
> >  
> > 2. The default value of "spark.shuffle.manager" has been changed to  
> > "sort".  
> > --> Old behavior can be restored by setting "spark.shuffle.manager" to  
> > "hash".  
> >  
> > == How does this differ from RC1 ==  
> > This has fixes for a handful of issues identified - some of the  
> > notable fixes are:  
> >  
> > [Core]  
> > SPARK-4498: Standalone Master can fail to recognize completed/failed  
> > applications  
> >  
> > [SQL]  
> > SPARK-4552: Query for empty parquet table in spark sql hive get  
> > IllegalArgumentException  
> > SPARK-4753: Parquet2 does not prune based on OR filters on partition  
> > columns  
> > SPARK-4761: With JDBC server, set Kryo as default serializer and  
> > disable reference tracking  
> > SPARK-4785: When called with arguments referring column fields, PMOD  
> > throws NPE  
> >  
> > - Patrick  
> >  
> > ---------------------------------------------------------------------  
> > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org  
> > For additional commands, e-mail: dev-help@spark.apache.org  
> >  
> >  
>  
