spark-dev mailing list archives

From John Zhuge <john.zh...@gmail.com>
Subject Re: [VOTE] Spark 2.3.1 (RC4)
Date Mon, 04 Jun 2018 15:06:39 GMT
+1

On Sun, Jun 3, 2018 at 6:12 PM, Hyukjin Kwon <gurwls223@gmail.com> wrote:

> +1
>
> On Sun, Jun 3, 2018 at 9:25 PM, Ricardo Almeida <ricardo.almeida@actnowib.com>
> wrote:
>
>> +1 (non-binding)
>>
>> On 3 June 2018 at 09:23, Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
>>
>>> +1
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Sat, Jun 2, 2018 at 8:09 PM, Denny Lee <denny.g.lee@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Sat, Jun 2, 2018 at 4:53 PM Nicholas Chammas <
>>>> nicholas.chammas@gmail.com> wrote:
>>>>
>>>>> I'll give that a try, but I'll still have to figure out what to do if
>>>>> none of the release builds work with hadoop-aws, since Flintrock deploys
>>>>> Spark release builds to set up a cluster. Building Spark is slow, so we
>>>>> only do it if the user specifically requests a Spark version by git hash.
>>>>> (This is basically how spark-ec2 did things, too.)
>>>>>
>>>>>
>>>>> On Sat, Jun 2, 2018 at 6:54 PM Marcelo Vanzin <vanzin@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> If you're building your own Spark, definitely try the hadoop-cloud
>>>>>> profile. Then you don't even need to pull anything at runtime,
>>>>>> everything is already packaged with Spark.
>>>>>>
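A minimal sketch of the build Marcelo describes, assuming a Maven-based distribution build from the Spark source tree (the profiles other than -Phadoop-cloud are illustrative, not requirements):

```shell
# Hedged sketch: build a Spark distribution with the hadoop-cloud profile so
# that the cloud connector jars (including hadoop-aws and a matching AWS SDK)
# ship inside the tarball and nothing needs to be fetched with --packages at
# runtime. Run from the root of a Spark source checkout.
./dev/make-distribution.sh --name hadoop-cloud --tgz \
    -Phadoop-2.7 -Phadoop-cloud -Phive -Pyarn
```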
>>>>>> On Fri, Jun 1, 2018 at 6:51 PM, Nicholas Chammas
>>>>>> <nicholas.chammas@gmail.com> wrote:
>>>>>> > pyspark --packages org.apache.hadoop:hadoop-aws:2.7.3 didn’t work for me
>>>>>> > either (even building with -Phadoop-2.7). I guess I’ve been relying on an
>>>>>> > unsupported pattern and will need to figure something else out going
>>>>>> > forward in order to use s3a://.
>>>>>> >
>>>>>> >
>>>>>> > On Fri, Jun 1, 2018 at 9:09 PM Marcelo Vanzin <vanzin@cloudera.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> I have personally never tried to include hadoop-aws that way. But at
>>>>>> >> the very least, I'd try to use the same version of Hadoop as the Spark
>>>>>> >> build (2.7.3 IIRC). I don't really expect a different version to work,
>>>>>> >> and if it did in the past it definitely was not by design.
>>>>>> >>
>>>>>> >> On Fri, Jun 1, 2018 at 5:50 PM, Nicholas Chammas
>>>>>> >> <nicholas.chammas@gmail.com> wrote:
>>>>>> >> > Building with -Phadoop-2.7 didn’t help, and if I remember correctly,
>>>>>> >> > building with -Phadoop-2.8 worked with hadoop-aws in the 2.3.0 release,
>>>>>> >> > so it appears something has changed since then.
>>>>>> >> >
>>>>>> >> > I wasn’t familiar with -Phadoop-cloud, but I can try that.
>>>>>> >> >
>>>>>> >> > My goal here is simply to confirm that this release of Spark works with
>>>>>> >> > hadoop-aws like past releases did, particularly for Flintrock users who
>>>>>> >> > use Spark with S3A.
>>>>>> >> >
>>>>>> >> > We currently provide -hadoop2.6, -hadoop2.7, and -without-hadoop builds
>>>>>> >> > with every Spark release. If the -hadoop2.7 release build won’t work
>>>>>> >> > with hadoop-aws anymore, are there plans to provide a new build type
>>>>>> >> > that will?
>>>>>> >> >
>>>>>> >> > Apologies if the question is poorly formed. I’m batting a bit outside
>>>>>> >> > my league here. Again, my goal is simply to confirm that I/my users
>>>>>> >> > still have a way to use s3a://. In the past, that way was simply to
>>>>>> >> > call pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4 or
>>>>>> >> > something very similar. If that will no longer work, I’m trying to
>>>>>> >> > confirm that the change of behavior is intentional or acceptable (as a
>>>>>> >> > review for the Spark project) and figure out what I need to change (as
>>>>>> >> > due diligence for Flintrock’s users).
>>>>>> >> >
>>>>>> >> > Nick
>>>>>> >> >
>>>>>> >> >
>>>>>> >> > On Fri, Jun 1, 2018 at 8:21 PM Marcelo Vanzin <vanzin@cloudera.com>
>>>>>> >> > wrote:
>>>>>> >> >>
>>>>>> >> >> Using the hadoop-aws package is probably going to be a little more
>>>>>> >> >> complicated than that. The best bet is to use a custom build of Spark
>>>>>> >> >> that includes it (use -Phadoop-cloud). Otherwise you're probably
>>>>>> >> >> looking at some nasty dependency issues, especially if you end up
>>>>>> >> >> mixing different versions of Hadoop.
>>>>>> >> >>
>>>>>> >> >> On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
>>>>>> >> >> <nicholas.chammas@gmail.com> wrote:
>>>>>> >> >> > I was able to successfully launch a Spark cluster on EC2 at 2.3.1
>>>>>> >> >> > RC4 using Flintrock. However, trying to load the hadoop-aws package
>>>>>> >> >> > gave me some errors.
>>>>>> >> >> >
>>>>>> >> >> > $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
>>>>>> >> >> >
>>>>>> >> >> > <snipped>
>>>>>> >> >> >
>>>>>> >> >> > :: problems summary ::
>>>>>> >> >> > :::: WARNINGS
>>>>>> >> >> >         [NOT FOUND  ] com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
>>>>>> >> >> >         ==== local-m2-cache: tried
>>>>>> >> >> >         file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
>>>>>> >> >> >         [NOT FOUND  ] com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>>>>>> >> >> >         ==== local-m2-cache: tried
>>>>>> >> >> >         file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
>>>>>> >> >> >         [NOT FOUND  ] org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>>>>>> >> >> >         ==== local-m2-cache: tried
>>>>>> >> >> >         file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
>>>>>> >> >> >         [NOT FOUND  ] com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>>>>>> >> >> >         ==== local-m2-cache: tried
>>>>>> >> >> >         file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>>>>>> >> >> >
>>>>>> >> >> > I’d guess I’m probably using the wrong version of hadoop-aws, but I
>>>>>> >> >> > called make-distribution.sh with -Phadoop-2.8 so I’m not sure what
>>>>>> >> >> > else to try.
>>>>>> >> >> >
>>>>>> >> >> > Any quick pointers?
>>>>>> >> >> >
>>>>>> >> >> > Nick
>>>>>> >> >> >
>>>>>> >> >> >
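For what it's worth, the [NOT FOUND] lines above come from Ivy's local-m2-cache resolver. A cleanup along these lines (a guess on my part, not advice given in the thread) sometimes clears stale resolution state so the artifacts get re-fetched from Maven Central on the next run:

```shell
# Hypothetical workaround: remove the Ivy cache entries for the artifacts
# that were marked [NOT FOUND] so the next `pyspark --packages ...` run
# resolves them fresh instead of trusting the stale cache. Assumes the
# default ~/.ivy2 location used by Spark's --packages mechanism.
rm -rf ~/.ivy2/cache/com.sun.jersey \
       ~/.ivy2/cache/org.codehaus.jettison \
       ~/.ivy2/cache/com.sun.xml.bind
```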
>>>>>> >> >> > On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin <vanzin@cloudera.com>
>>>>>> >> >> > wrote:
>>>>>> >> >> >>
>>>>>> >> >> >> Starting with my own +1 (binding).
>>>>>> >> >> >>
>>>>>> >> >> >> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin <vanzin@cloudera.com>
>>>>>> >> >> >> wrote:
>>>>>> >> >> >> > Please vote on releasing the following candidate as Apache
>>>>>> >> >> >> > Spark version 2.3.1.
>>>>>> >> >> >> >
>>>>>> >> >> >> > Given that I expect at least a few people to be busy with Spark
>>>>>> >> >> >> > Summit next week, I'm taking the liberty of setting an extended
>>>>>> >> >> >> > voting period. The vote will be open until Friday, June 8th, at
>>>>>> >> >> >> > 19:00 UTC (that's 12:00 PDT).
>>>>>> >> >> >> >
>>>>>> >> >> >> > It passes with a majority of +1 votes, which must include at
>>>>>> >> >> >> > least 3 +1 votes from the PMC.
>>>>>> >> >> >> >
>>>>>> >> >> >> > [ ] +1 Release this package as Apache Spark 2.3.1
>>>>>> >> >> >> > [ ] -1 Do not release this package because ...
>>>>>> >> >> >> >
>>>>>> >> >> >> > To learn more about Apache Spark, please see
>>>>>> >> >> >> > http://spark.apache.org/
>>>>>> >> >> >> >
>>>>>> >> >> >> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
>>>>>> >> >> >> > https://github.com/apache/spark/tree/v2.3.1-rc4
>>>>>> >> >> >> >
>>>>>> >> >> >> > The release files, including signatures, digests, etc. can be
>>>>>> >> >> >> > found at:
>>>>>> >> >> >> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>>>>>> >> >> >> >
>>>>>> >> >> >> > Signatures used for Spark RCs can be found in this file:
>>>>>> >> >> >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>> >> >> >> >
>>>>>> >> >> >> > The staging repository for this release can be found at:
>>>>>> >> >> >> > https://repository.apache.org/content/repositories/orgapachespark-1272/
>>>>>> >> >> >> >
>>>>>> >> >> >> > The documentation corresponding to this release can be found at:
>>>>>> >> >> >> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>>>>>> >> >> >> >
>>>>>> >> >> >> > The list of bug fixes going into 2.3.1 can be found at the
>>>>>> >> >> >> > following URL:
>>>>>> >> >> >> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
>>>>>> >> >> >> >
>>>>>> >> >> >> > FAQ
>>>>>> >> >> >> >
>>>>>> >> >> >> > =========================
>>>>>> >> >> >> > How can I help test this release?
>>>>>> >> >> >> > =========================
>>>>>> >> >> >> >
>>>>>> >> >> >> > If you are a Spark user, you can help us test this release by
>>>>>> >> >> >> > taking an existing Spark workload and running it on this release
>>>>>> >> >> >> > candidate, then reporting any regressions.
>>>>>> >> >> >> >
>>>>>> >> >> >> > If you're working in PySpark you can set up a virtual env and
>>>>>> >> >> >> > install the current RC and see if anything important breaks; in
>>>>>> >> >> >> > Java/Scala you can add the staging repository to your project's
>>>>>> >> >> >> > resolvers and test with the RC (make sure to clean up the
>>>>>> >> >> >> > artifact cache before/after so you don't end up building with an
>>>>>> >> >> >> > out-of-date RC going forward).
>>>>>> >> >> >> >
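The two test paths described above could be sketched roughly as follows (the pyspark tarball name is an assumption on my part; check the -bin/ directory listing for the actual file names):

```shell
# PySpark path: install the RC into a disposable virtualenv.
python -m venv /tmp/spark-231-rc4
. /tmp/spark-231-rc4/bin/activate
pip install "https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/pyspark-2.3.1.tar.gz"

# Java/Scala path: add the staging repository to your resolvers,
# e.g. in build.sbt:
#   resolvers += "Spark 2.3.1 RC4 staging" at
#     "https://repository.apache.org/content/repositories/orgapachespark-1272/"
```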
>>>>>> >> >> >> > ===========================================
>>>>>> >> >> >> > What should happen to JIRA tickets still targeting 2.3.1?
>>>>>> >> >> >> > ===========================================
>>>>>> >> >> >> >
>>>>>> >> >> >> > The current list of open tickets targeted at 2.3.1 can be found
>>>>>> >> >> >> > at:
>>>>>> >> >> >> > https://s.apache.org/Q3Uo
>>>>>> >> >> >> >
>>>>>> >> >> >> > Committers should look at those and triage. Extremely important
>>>>>> >> >> >> > bug fixes, documentation, and API tweaks that impact
>>>>>> >> >> >> > compatibility should be worked on immediately. Everything else
>>>>>> >> >> >> > please retarget to an appropriate release.
>>>>>> >> >> >> >
>>>>>> >> >> >> > ==================
>>>>>> >> >> >> > But my bug isn't fixed?
>>>>>> >> >> >> > ==================
>>>>>> >> >> >> >
>>>>>> >> >> >> > In order to make timely releases, we will typically not hold the
>>>>>> >> >> >> > release unless the bug in question is a regression from the
>>>>>> >> >> >> > previous release. That being said, if there is something which
>>>>>> >> >> >> > is a regression that has not been correctly targeted please ping
>>>>>> >> >> >> > me or a committer to help target the issue.
>>>>>> >> >> >> >
>>>>>> >> >> >> >
>>>>>> >> >> >> > --
>>>>>> >> >> >> > Marcelo
>>>>>> >> >> >>
>>>>>> >> >> >>
>>>>>> >> >> >>
>>>>>> >> >> >> --
>>>>>> >> >> >> Marcelo
>>>>>> >> >> >>
>>>>>> >> >> >>
>>>>>> >> >> >> ---------------------------------------------------------------------
>>>>>> >> >> >> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>> >> >> >>
>>>>>> >> >> >
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >> --
>>>>>> >> >> Marcelo
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Marcelo
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Marcelo
>>>>>>
>>>>>
>>>
>>


-- 
John
