spark-dev mailing list archives

From amnonkhen <>
Subject saveAsTextFile to s3 on spark does not work, just hangs
Date Sun, 10 Aug 2014 10:41:28 GMT
I am loading a CSV text file from S3 into Spark, filtering and mapping the
records, and writing the result back to S3.
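
A minimal sketch of the kind of job described, using the Scala RDD API (the bucket/path names and the filter/map logic here are hypothetical placeholders, not the actual code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: load a CSV from S3, filter and map the rows, write back to S3.
// Paths and record logic are placeholders.
val conf = new SparkConf().setAppName("csv-s3-job")
val sc = new SparkContext(conf)

val input = sc.textFile("s3n://my-bucket/input/data.csv")

val result = input
  .map(_.split(","))                                      // naive CSV split per row
  .filter(fields => fields.nonEmpty && fields(0) != "id") // e.g. drop header/bad rows
  .map(fields => fields.mkString("\t"))                   // re-serialize

result.saveAsTextFile("s3n://my-bucket/output/")
```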

I have tried several input sizes: 100k rows, 1M rows, and 3.5M rows. The first
two finish successfully, while the last (3.5M rows) hangs in a strange state:
the job-stages monitoring web app (the one on port 4040) stops responding,
and the command-line console gets stuck and does not even respond to Ctrl-C.
The master's web monitoring app still responds and shows the state as

In S3, I see an empty directory containing a single zero-sized entry,
_temporary_$folder$. The S3 URL is given using the s3n:// protocol.

I did not see any errors in the logs in the web console. I also tried several
cluster sizes (1 master + 1 worker, 1 master + 5 workers) and ended up in the
same state.

Has anyone encountered such an issue? Any idea what's going on?

I also posted this question to Stack Overflow:

