Gentle ping: spark-1.6.1-bin-hadoop2.4.tgz from S3 is still corrupt.

On Wed, Apr 6, 2016 at 12:55 PM, Josh Rosen <joshrosen@databricks.com> wrote:
Sure, I'll take a look. Planning to do full verification in a bit.

On Wed, Apr 6, 2016 at 12:54 PM Ted Yu <yuzhihong@gmail.com> wrote:
Josh:
Can you check spark-1.6.1-bin-hadoop2.4.tgz ?

$ tar zxf spark-1.6.1-bin-hadoop2.4.tgz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

$ ls -l !$
ls -l spark-1.6.1-bin-hadoop2.4.tgz
-rw-r--r--. 1 hbase hadoop 323614720 Apr  5 19:25 spark-1.6.1-bin-hadoop2.4.tgz
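
For anyone else triaging these downloads, here is a minimal sketch of a pre-unpack check. It assumes a shell with head, od and gzip available; the function name check_tgz is hypothetical and not part of any Spark tooling. gzip typically reports "not in gzip format" when the file does not start with the gzip magic bytes (1f 8b), while "unexpected end of file" usually points to a truncated stream, so the sketch tests for each case separately.

```shell
#!/usr/bin/env bash
# check_tgz is a hypothetical helper name, not part of any Spark tooling.
# Returns 0 if the file starts with the gzip magic bytes (1f 8b) and
# passes `gzip -t`; prints a short diagnosis and returns 1 otherwise.
check_tgz() {
  local file="$1"
  local magic
  # Read the first two bytes of the file as lowercase hex.
  magic=$(head -c 2 "$file" | od -An -tx1 | tr -d ' \n')
  if [ "$magic" != "1f8b" ]; then
    echo "$file: not a gzip file (first bytes: ${magic:-none})"
    return 1
  fi
  # gzip -t decompresses the whole stream and checks its CRC, so a
  # truncated download fails here.
  if ! gzip -t "$file" 2>/dev/null; then
    echo "$file: gzip stream is truncated or damaged"
    return 1
  fi
  echo "$file: OK"
}
```

Usage would be something like `check_tgz spark-1.6.1-bin-hadoop2.4.tgz` before running tar.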

Thanks

On Wed, Apr 6, 2016 at 12:19 PM, Josh Rosen <joshrosen@databricks.com> wrote:
I downloaded the Spark 1.6.1 artifacts from the Apache mirror network and re-uploaded them to the spark-related-packages S3 bucket, so hopefully these packages should be fixed now.

On Mon, Apr 4, 2016 at 3:37 PM Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
Thanks, that was the command. :thumbsup:

On Mon, Apr 4, 2016 at 6:28 PM Jakob Odersky <jakob@odersky.com> wrote:
I just found out how the hash is calculated:

gpg --print-md sha512 <spark-archive>.tgz

You can use that to check whether the output matches the contents
of <spark-archive>.tgz.sha
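
A rough sketch of automating that comparison follows. It assumes the published .sha file holds `gpg --print-md` style output, i.e. the file name and a colon followed by space-separated uppercase hex groups that may wrap over several lines, and that sha512sum is installed. The helper names normalize_digest and verify_sha512 are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Reduce a digest listing to bare lowercase hex. Assumes input shaped like
# `gpg --print-md` output: "<name>:" followed by space-separated uppercase
# hex groups, possibly wrapped over several indented lines.
normalize_digest() {
  sed 's/^[^:]*://' | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# Succeeds (exit 0) only if the archive's SHA-512 equals the digest
# stored in the published .sha file.
verify_sha512() {
  local archive="$1" published="$2"
  local actual expected
  actual=$(sha512sum "$archive" | cut -d ' ' -f 1)
  expected=$(normalize_digest < "$published")
  [ "$actual" = "$expected" ]
}
```

For example, `verify_sha512 <spark-archive>.tgz <spark-archive>.tgz.sha && echo match`. If the .sha file turns out to use a different layout, normalize_digest would need adjusting.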

On Mon, Apr 4, 2016 at 3:19 PM, Jakob Odersky <jakob@odersky.com> wrote:
> The published hash is a SHA512.
>
> You can verify the integrity of the packages by running `sha512sum` on
> the archive and comparing the computed hash with the published one.
> Unfortunately, I don't know what tool is used to generate the published
> hash and I can't reproduce its format, so I ended up comparing the
> hashes manually.
>
> On Mon, Apr 4, 2016 at 2:39 PM, Nicholas Chammas
> <nicholas.chammas@gmail.com> wrote:
>> An additional note: The Spark packages being served off of CloudFront (i.e.
>> the “direct download” option on spark.apache.org) are also corrupt.
>>
>> Btw what’s the correct way to verify the SHA of a Spark package? I’ve tried
>> a few commands on working packages downloaded from Apache mirrors, but I
>> can’t seem to reproduce the published SHA for spark-1.6.1-bin-hadoop2.6.tgz.
>>
>>
>> On Mon, Apr 4, 2016 at 11:45 AM Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>> Maybe temporarily take down the artifacts on S3 until the root cause is
>>> found.
>>>
>>> On Thu, Mar 24, 2016 at 7:25 AM, Nicholas Chammas
>>> <nicholas.chammas@gmail.com> wrote:
>>>>
>>>> Just checking in on this again as the builds on S3 are still broken. :/
>>>>
>>>> Could it have something to do with us moving release-build.sh?
>>>>
>>>>
>>>> On Mon, Mar 21, 2016 at 1:43 PM Nicholas Chammas
>>>> <nicholas.chammas@gmail.com> wrote:
>>>>>
>>>>> Is someone going to retry fixing these packages? It's still a problem.
>>>>>
>>>>> Also, it would be good to understand why this is happening.
>>>>>
>>>>> On Fri, Mar 18, 2016 at 6:49 PM Jakob Odersky <jakob@odersky.com> wrote:
>>>>>>
>>>>>> I just realized you're using a different download site. Sorry for the
>>>>>> confusion, the link I get for a direct download of Spark 1.6.1 /
>>>>>> Hadoop 2.6 is
>>>>>> http://d3kbcqa49mib13.cloudfront.net/spark-1.6.1-bin-hadoop2.6.tgz
>>>>>>
>>>>>> On Fri, Mar 18, 2016 at 3:20 PM, Nicholas Chammas
>>>>>> <nicholas.chammas@gmail.com> wrote:
>>>>>> > I just retried the Spark 1.6.1 / Hadoop 2.6 download and got a
>>>>>> > corrupt ZIP file.
>>>>>> >
>>>>>> > Jakob, are you sure the ZIP unpacks correctly for you? Is it the same
>>>>>> > Spark 1.6.1/Hadoop 2.6 package you had success with?
>>>>>> >
>>>>>> > On Fri, Mar 18, 2016 at 6:11 PM Jakob Odersky <jakob@odersky.com> wrote:
>>>>>> >>
>>>>>> >> I just experienced the issue; however, retrying the download a second
>>>>>> >> time worked. Could it be that there is some load balancer/cache in
>>>>>> >> front of the archive and some nodes still serve the corrupt
>>>>>> >> packages?
>>>>>> >>
>>>>>> >> On Fri, Mar 18, 2016 at 8:00 AM, Nicholas Chammas
>>>>>> >> <nicholas.chammas@gmail.com> wrote:
>>>>>> >> > I'm seeing the same. :(
>>>>>> >> >
>>>>>> >> > On Fri, Mar 18, 2016 at 10:57 AM Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>> >> >>
>>>>>> >> >> I tried again this morning:
>>>>>> >> >>
>>>>>> >> >> $ wget https://s3.amazonaws.com/spark-related-packages/spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >> --2016-03-18 07:55:30--  https://s3.amazonaws.com/spark-related-packages/spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >> Resolving s3.amazonaws.com... 54.231.19.163
>>>>>> >> >> ...
>>>>>> >> >> $ tar zxf spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >>
>>>>>> >> >> gzip: stdin: unexpected end of file
>>>>>> >> >> tar: Unexpected EOF in archive
>>>>>> >> >> tar: Unexpected EOF in archive
>>>>>> >> >> tar: Error is not recoverable: exiting now
>>>>>> >> >>
>>>>>> >> >> On Thu, Mar 17, 2016 at 8:57 AM, Michael Armbrust <michael@databricks.com> wrote:
>>>>>> >> >>>
>>>>>> >> >>> Patrick reuploaded the artifacts, so it should be fixed now.
>>>>>> >> >>>
>>>>>> >> >>> On Mar 16, 2016 5:48 PM, "Nicholas Chammas" <nicholas.chammas@gmail.com> wrote:
>>>>>> >> >>>>
>>>>>> >> >>>> Looks like the other packages may also be corrupt. I’m getting
>>>>>> >> >>>> the same error for the Spark 1.6.1 / Hadoop 2.4 package.
>>>>>> >> >>>>
>>>>>> >> >>>> https://s3.amazonaws.com/spark-related-packages/spark-1.6.1-bin-hadoop2.4.tgz
>>>>>> >> >>>>
>>>>>> >> >>>> Nick
>>>>>> >> >>>>
>>>>>> >> >>>>
>>>>>> >> >>>> On Wed, Mar 16, 2016 at 8:28 PM Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>> >> >>>>>
>>>>>> >> >>>>> On Linux, I got:
>>>>>> >> >>>>>
>>>>>> >> >>>>> $ tar zxf spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >>>>>
>>>>>> >> >>>>> gzip: stdin: unexpected end of file
>>>>>> >> >>>>> tar: Unexpected EOF in archive
>>>>>> >> >>>>> tar: Unexpected EOF in archive
>>>>>> >> >>>>> tar: Error is not recoverable: exiting now
>>>>>> >> >>>>>
>>>>>> >> >>>>> On Wed, Mar 16, 2016 at 5:15 PM, Nicholas Chammas
>>>>>> >> >>>>> <nicholas.chammas@gmail.com> wrote:
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> https://s3.amazonaws.com/spark-related-packages/spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> Does anyone else have trouble unzipping this? How did this happen?
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> What I get is:
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> $ gzip -t spark-1.6.1-bin-hadoop2.6.tgz
>>>>>> >> >>>>>> gzip: spark-1.6.1-bin-hadoop2.6.tgz: unexpected end of file
>>>>>> >> >>>>>> gzip: spark-1.6.1-bin-hadoop2.6.tgz: uncompress failed
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> Seems like a strange type of problem to come across.
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> Nick