From: Michael Armbrust
Date: Fri, 22 May 2015 14:19:33 -0700
Subject: Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
To: Justin Uang
Cc: Imran Rashid, Patrick Wendell, dev@spark.apache.org
Thanks for the feedback. As you stated, UDTs are explicitly not a public
API, as we knew we were going to be making breaking changes to them. We
hope to stabilize / open them up in future releases.

Regarding the Hive issue, have you tried using TestHive instead? This is
what we use for testing, and it takes care of creating temporary
directories for all storage. It also has a reset() function that you can
call in between tests. If this doesn't work for you, maybe open a JIRA and
we can discuss more there.

On Fri, May 22, 2015 at 12:56 PM, Justin Uang <justin.uang@gmail.com> wrote:

> I'm working on one of the Palantir teams using Spark, and here is our
> feedback:
>
> We have encountered three issues when upgrading to Spark 1.4.0.
> I'm not sure they qualify as a -1, as they come from using non-public
> APIs and multiple Spark contexts for the purposes of testing, but I do
> want to bring them up for awareness =)
>
> 1. Our UDT was serializing to a StringType, but now strings are
>    represented internally as UTF8String, so we had to change our UDT to
>    use UTF8String.apply() and UTF8String.toString() to convert back to
>    String.
> 2. createDataFrame when using UDTs used to accept things in the
>    serialized Catalyst form. Now they're supposed to be in the UDT Java
>    class form (I think this change would already have affected us in
>    1.3.1, since we were on 1.3.0).
> 3. Derby database lifecycle management issue with HiveContext. We have
>    been using a SparkContextResource JUnit Rule that we wrote; it sets
>    up and then tears down a SparkContext and HiveContext between unit
>    test runs within the same process (possibly the same thread as well).
>    Multiple contexts are not being used at once. It used to work in
>    1.3.0, but now when we try to create the HiveContext for the second
>    unit test, it complains with the following exception. I have a
>    feeling it might have something to do with the Hive object being
>    thread-local, and us not explicitly closing the HiveContext and
>    everything it holds. The full stack trace is here:
>    https://gist.github.com/justinuang/0403d49cdeedf91727cd
>
> Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
>         at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>
>
> On Wed, May 20, 2015 at 10:35 AM Imran Rashid <irashid@cloudera.com> wrote:
>
>> -1
>>
>> discovered I accidentally removed master & worker json endpoints, will
>> restore
>> https://issues.apache.org/jira/browse/SPARK-7760
>>
>> On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell <pwendell@gmail.com> wrote:
>>
>>> Please vote on releasing the following candidate as Apache Spark
>>> version 1.4.0!
>>>
>>> The tag to be voted on is v1.4.0-rc1 (commit 777a081):
>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> http://people.apache.org/~pwendell/spark-1.4.0-rc1/
>>>
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/pwendell.asc
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1092/
>>>
>>> The documentation corresponding to this release can be found at:
>>> http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/
>>>
>>> Please vote on releasing this package as Apache Spark 1.4.0!
>>>
>>> The vote is open until Friday, May 22, at 17:03 UTC and passes
>>> if a majority of at least 3 +1 PMC votes are cast.
>>>
>>> [ ] +1 Release this package as Apache Spark 1.4.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>>
>>> == How can I help test this release? ==
>>> If you are a Spark user, you can help us test this release by
>>> taking a Spark 1.3 workload and running it on this release candidate,
>>> then reporting any regressions.
>>>
>>> == What justifies a -1 vote for this release? ==
>>> This vote is happening towards the end of the 1.4 QA period,
>>> so -1 votes should only occur for significant regressions from 1.3.1.
>>> Bugs already present in 1.3.X, minor regressions, or bugs related
>>> to new features will not block this release.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
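[Editor's note: for readers outside the Spark codebase, the reuse-and-reset pattern Michael suggests at the top of this thread (keep one long-lived context per process and call reset() between tests, rather than tearing the context down and recreating it) can be sketched in plain Python. FakeContext below is a purely hypothetical stand-in for an expensive context such as a HiveContext; none of its names are Spark API.]

```python
# Sketch of the "one shared context + reset() between tests" pattern.
# FakeContext is an illustrative stand-in, not Spark API.

class FakeContext:
    """Stand-in for a costly, process-global context object."""

    def __init__(self):
        self.times_created = 1   # pretend construction is expensive
        self.tables = {}         # mutable per-test state

    def register(self, name, rows):
        self.tables[name] = rows

    def reset(self):
        # Clear per-test state without tearing the context down, the way
        # TestHive's reset() clears temporary state between tests.
        self.tables.clear()


# Create the context once per process ...
ctx = FakeContext()

# ... then reset() between "tests" instead of recreating it.
ctx.register("t", [1, 2, 3])
assert len(ctx.tables) == 1      # state written by test 1

ctx.reset()                      # runs between tests
assert ctx.tables == {}          # test 2 sees a clean slate
assert ctx.times_created == 1    # the context was never rebuilt
```

This sidesteps lifecycle problems like the Derby metastore_db failure above, because the process-global state behind the context is initialized exactly once.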