spark-dev mailing list archives

From Matei Zaharia
Subject Re: saveAsTextFile to s3 on spark does not work, just hangs
Date Mon, 25 Aug 2014 20:26:02 GMT
Was the original issue with Spark 1.1 (i.e. master branch) or an earlier release?

One possibility is that your S3 bucket is in a remote Amazon region, which would make writes
very slow. In my experience, though, saveAsTextFile has worked even for pretty large datasets
in that situation, so maybe something else in your job is causing the problem. Have you tried
other operations on the data, like count(), or saving a synthetic dataset (e.g.
sc.parallelize(1 to 100*1000*1000, 20).saveAsTextFile(...))?
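
To make the two checks concrete, here is a minimal sketch of a standalone job that runs count() on the synthetic dataset and then saves it, timing each step. The object name S3SaveCheck and the output path argument are hypothetical, and the sketch assumes S3 credentials are already configured for the cluster. If count() returns promptly but the save stalls, the write path is the likelier culprit; if both stall, the problem is probably elsewhere in the setup.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical diagnostic job; the name and argument handling are illustrative only.
object S3SaveCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: the output location (e.g. an S3 path) is passed as the first argument.
    // It must not already exist, or saveAsTextFile will fail.
    val outputPath = args(0)

    val sc = new SparkContext(new SparkConf().setAppName("S3SaveCheck"))
    try {
      // Synthetic dataset from the message: 100 million integers in 20 partitions.
      val synthetic = sc.parallelize(1 to 100 * 1000 * 1000, 20)

      // Check 1: an action that does not touch the output filesystem at all.
      val countStart = System.nanoTime()
      val n = synthetic.count()
      println(s"count() returned $n after ${(System.nanoTime() - countStart) / 1e9} s")

      // Check 2: the same data written out with saveAsTextFile.
      val saveStart = System.nanoTime()
      synthetic.saveAsTextFile(outputPath)
      println(s"saveAsTextFile finished after ${(System.nanoTime() - saveStart) / 1e9} s")
    } finally {
      sc.stop()
    }
  }
}
```

Running the same job once against a local or HDFS path and once against the S3 path gives a rough baseline for how much of the time is spent on the S3 write itself.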


On August 25, 2014 at 12:09:25 PM, amnonkhen wrote:

Hi jerryye, 
Maybe if you voted up my question on Stack Overflow it would get some 
traction and we would get nearer to a solution. 
