spark-dev mailing list archives

From Nicholas Chammas <nicholas.cham...@gmail.com>
Subject Re: Downloading Hadoop from s3://spark-related-packages/
Date Sun, 01 Nov 2015 22:16:39 GMT
OK, I’ll focus on the Apache mirrors going forward.

> > The problem with the Apache mirrors, if I am not mistaken, is that you
> > cannot use a single URL that automatically redirects you to a working
> > mirror to download Hadoop. You have to pick a specific mirror and pray it
> > doesn’t disappear tomorrow.
>
> They don’t go away, especially http://mirror.ox.ac.uk , and in the US there
> is apache.osuosl.org, OSU being where a lot of the ASF servers are kept.

So does Apache offer no way to query a URL and automatically get the
closest working mirror? If I’m installing HDFS onto servers in various EC2
regions, the best mirror will vary depending on my location.
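
In the meantime, one workaround for the "mirror disappears" half of the problem would be to keep a short list of mirrors and take the first one that answers. This is only a rough sketch: the mirror list, the hadoop_path and first_working_url helpers, and the /hadoop/common/hadoop-X/hadoop-X.tar.gz layout are my own assumptions and would need checking against each mirror. It does nothing about picking the closest mirror.

```python
import urllib.error
import urllib.request

# Assumed candidate list: the two hosts named in this thread. Any path
# prefix a given mirror needs in front of /hadoop is NOT included here
# and must be checked per mirror.
MIRRORS = [
    "http://apache.osuosl.org",
    "http://mirror.ox.ac.uk",
]


def hadoop_path(version):
    """Build the tarball path, assuming the usual Apache dist layout."""
    return "/hadoop/common/hadoop-{v}/hadoop-{v}.tar.gz".format(v=version)


def is_reachable(url, timeout=10):
    """Return True if a HEAD request to url ends in a 2xx response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def first_working_url(mirrors, path, probe=is_reachable):
    """Return the first mirror URL for path that the probe accepts, else None."""
    for base in mirrors:
        url = base.rstrip("/") + path
        if probe(url):
            return url
    return None
```

Calling first_working_url(MIRRORS, hadoop_path("2.7.1")) would return a download URL or None. The probe argument is there so the selection logic can be tested without touching the network.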

Nick

On Sun, Nov 1, 2015 at 12:25 PM Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> I think that getting them from the ASF mirrors is a better strategy in
> general as it'll remove the overhead of keeping the S3 bucket up to
> date. It works in the spark-ec2 case because we only support a limited
> number of Hadoop versions from the tool. FWIW I don't have write
> access to the bucket and also haven't heard of any plans to support
> newer versions in spark-ec2.
>
> Thanks
> Shivaram
>
> On Sun, Nov 1, 2015 at 2:30 AM, Steve Loughran <stevel@hortonworks.com>
> wrote:
> >
> > On 1 Nov 2015, at 03:17, Nicholas Chammas <nicholas.chammas@gmail.com>
> > wrote:
> >
> > https://s3.amazonaws.com/spark-related-packages/
> >
> > spark-ec2 uses this bucket to download and install HDFS on clusters. Is it
> > owned by the Spark project or by the AMPLab?
> >
> > Anyway, it looks like the latest Hadoop install available on there is
> > Hadoop 2.4.0.
> >
> > Are there plans to add newer versions of Hadoop for use by spark-ec2 and
> > similar tools, or should we just be getting that stuff via an Apache mirror?
> > The latest version is 2.7.1, by the way.
> >
> >
> > you should be grabbing the artifacts off the ASF and then verifying their
> > SHA1 checksums as published on the ASF HTTPS web site
> >
> >
> > The problem with the Apache mirrors, if I am not mistaken, is that you
> > cannot use a single URL that automatically redirects you to a working mirror
> > to download Hadoop. You have to pick a specific mirror and pray it doesn't
> > disappear tomorrow.
> >
> >
> > They don't go away, especially http://mirror.ox.ac.uk , and in the US there
> > is apache.osuosl.org, OSU being where a lot of the ASF servers are kept.
> >
> > full list with availability stats
> >
> > http://www.apache.org/mirrors/
> >
> >
>
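
On the checksum point quoted above: whichever mirror the tarball comes from, the check itself is simple. A minimal sketch, assuming the published SHA1 has already been fetched separately from the ASF HTTPS site and is passed in as a string. The function names are hypothetical, and stripping whitespace and case is my guess at handling digests that are published in spaced, upper-case groups.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha1(path, published):
    """Compare a file's SHA1 with a published digest string.

    Whitespace and letter case in the published value are ignored.
    """
    expected = "".join(published.split()).lower()
    return sha1_of_file(path) == expected
```

A download script would call matches_published_sha1("hadoop-2.7.1.tar.gz", published_digest) and refuse to install on False.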
