spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nicholas Chammas <nicholas.cham...@gmail.com>
Subject Re: Downloading Hadoop from s3://spark-related-packages/
Date Mon, 02 Nov 2015 00:08:06 GMT
Hmm, yeah, some Googling confirms this, though there isn't any clear
documentation about this.

Strangely, if I click on the link from your email the download works, but
curl and wget somehow don't get redirected correctly...

Nick

On Sun, Nov 1, 2015 at 6:40 PM Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> I think the lua one at
>
> https://svn.apache.org/repos/asf/infrastructure/site/trunk/content/dyn/closer.lua
> has replaced the cgi one from before. Also it looks like the lua one
> also supports `action=download` with a filename argument. So you could
> just do something like
>
> wget
> http://www.apache.org/dyn/closer.lua?filename=hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz&action=download
>
> Thanks
> Shivaram
>
> On Sun, Nov 1, 2015 at 3:18 PM, Nicholas Chammas
> <nicholas.chammas@gmail.com> wrote:
> > Oh, sweet! For example:
> >
> >
> http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz?asjson=1
> >
> > Thanks for sharing that tip. Looks like you can also use as_json (vs.
> > asjson).
> >
> > Nick
> >
> >
> > On Sun, Nov 1, 2015 at 5:32 PM Shivaram Venkataraman
> > <shivaram@eecs.berkeley.edu> wrote:
> >>
> >> On Sun, Nov 1, 2015 at 2:16 PM, Nicholas Chammas
> >> <nicholas.chammas@gmail.com> wrote:
> >> > OK, I’ll focus on the Apache mirrors going forward.
> >> >
> >> > The problem with the Apache mirrors, if I am not mistaken, is that you
> >> > cannot use a single URL that automatically redirects you to a working
> >> > mirror
> >> > to download Hadoop. You have to pick a specific mirror and pray it
> >> > doesn’t
> >> > disappear tomorrow.
> >> >
> >> > They don’t go away, especially http://mirror.ox.ac.uk , and in the us
> >> > the
> >> > apache.osuosl.org, osu being a where a lot of the ASF servers are
> kept.
> >> >
> >> > So does Apache offer no way to query a URL and automatically get the
> >> > closest
> >> > working mirror? If I’m installing HDFS onto servers in various EC2
> >> > regions,
> >> > the best mirror will vary depending on my location.
> >> >
> >> Not sure if this is officially documented somewhere but if you pass
> >> '&asjson=1' you will get back a JSON which has a 'preferred' field set
> >> to the closest mirror.
> >>
> >> Shivaram
> >> > Nick
> >> >
> >> >
> >> > On Sun, Nov 1, 2015 at 12:25 PM Shivaram Venkataraman
> >> > <shivaram@eecs.berkeley.edu> wrote:
> >> >>
> >> >> I think that getting them from the ASF mirrors is a better strategy
> in
> >> >> general as it'll remove the overhead of keeping the S3 bucket up to
> >> >> date. It works in the spark-ec2 case because we only support a
> limited
> >> >> number of Hadoop versions from the tool. FWIW I don't have write
> >> >> access to the bucket and also haven't heard of any plans to support
> >> >> newer versions in spark-ec2.
> >> >>
> >> >> Thanks
> >> >> Shivaram
> >> >>
> >> >> On Sun, Nov 1, 2015 at 2:30 AM, Steve Loughran <
> stevel@hortonworks.com>
> >> >> wrote:
> >> >> >
> >> >> > On 1 Nov 2015, at 03:17, Nicholas Chammas
> >> >> > <nicholas.chammas@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> > https://s3.amazonaws.com/spark-related-packages/
> >> >> >
> >> >> > spark-ec2 uses this bucket to download and install HDFS on
> clusters.
> >> >> > Is
> >> >> > it
> >> >> > owned by the Spark project or by the AMPLab?
> >> >> >
> >> >> > Anyway, it looks like the latest Hadoop install available on there
> is
> >> >> > Hadoop
> >> >> > 2.4.0.
> >> >> >
> >> >> > Are there plans to add newer versions of Hadoop for use by
> spark-ec2
> >> >> > and
> >> >> > similar tools, or should we just be getting that stuff via an
> Apache
> >> >> > mirror?
> >> >> > The latest version is 2.7.1, by the way.
> >> >> >
> >> >> >
> >> >> > you should be grabbing the artifacts off the ASF and then verifying
> >> >> > their
> >> >> > SHA1 checksums as published on the ASF HTTPS web site
> >> >> >
> >> >> >
> >> >> > The problem with the Apache mirrors, if I am not mistaken, is
that
> >> >> > you
> >> >> > cannot use a single URL that automatically redirects you to a
> working
> >> >> > mirror
> >> >> > to download Hadoop. You have to pick a specific mirror and pray
it
> >> >> > doesn't
> >> >> > disappear tomorrow.
> >> >> >
> >> >> >
> >> >> > They don't go away, especially http://mirror.ox.ac.uk , and in
> the us
> >> >> > the
> >> >> > apache.osuosl.org, osu being a where a lot of the ASF servers
are
> >> >> > kept.
> >> >> >
> >> >> > full list with availability stats
> >> >> >
> >> >> > http://www.apache.org/mirrors/
> >> >> >
> >> >> >
>

Mime
View raw message