spark-user mailing list archives

From Jey Kottalam <>
Subject Re: Quality of documentation (rant)
Date Mon, 20 Jan 2014 22:59:35 GMT
>> This sounds like either a bug or somehow the S3 library requiring lots of
>> memory to read a block. There isn’t a separate way to run HDFS over S3.
>> Hadoop just has different implementations of “file systems”, one of which is
>> S3. There’s a pointer to these versions at the bottom of
>> but it is indeed pretty hidden in the docs.
> Hmmm. Maybe a bug then. If I read a small 600 byte file via the s3n:// uri -
> it works on a spark cluster. If I try a 20GB file it just sits and sits and
> sits frozen. Is there anything I can do to instrument this and figure out
> what is going on?
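For what it's worth, when s3n:// reads hang it is also worth double-checking that the Hadoop S3 credentials are configured. A minimal sketch of the relevant core-site.xml entries, using the standard Hadoop s3n property names (the values here are placeholders):

```xml
<!-- core-site.xml: credentials for the s3n:// filesystem.
     Property names are the standard Hadoop s3n keys; values are placeholders. -->
<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

The same keys can also be set at runtime on the SparkContext's hadoopConfiguration, or embedded in the URI itself (s3n://KEY:SECRET@bucket/path), though embedding secrets in URIs is generally discouraged.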

Try taking a look at the stderr log of the executor that failed; you
should see a more detailed error message there. The stderr logs can be
found by browsing to the Spark master's web UI at http://mymaster:8080
(where `mymaster` is the hostname of your Spark master) and drilling
down to the failed application's executors.

Hope that helps,
