spark-dev mailing list archives

From amnonkhen <amnon...@gmail.com>
Subject saveAsTextFile to s3 on spark does not work, just hangs
Date Sun, 10 Aug 2014 10:41:28 GMT
I am loading a csv text file from s3 into spark, filtering and mapping the
records and writing the result to s3.
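For context, the job is roughly of this shape (a minimal sketch; the bucket names, filter predicate, and mapping below are placeholders, not the actual code):

```scala
// Sketch of the described pipeline. Bucket/key names, the filter
// predicate, and the mapping are hypothetical placeholders.
import org.apache.spark.{SparkConf, SparkContext}

object CsvToS3 {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("csv-to-s3"))

    // Load the CSV text file from s3 via the s3n:// protocol
    val lines = sc.textFile("s3n://my-input-bucket/input.csv")

    val result = lines
      .filter(line => !line.startsWith("#"))                  // illustrative filter
      .map(line => line.split(",").take(2).mkString("\t"))    // illustrative mapping

    // This is the step that hangs for the 3.5M-row input
    result.saveAsTextFile("s3n://my-output-bucket/output")

    sc.stop()
  }
}
```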

I have tried several input sizes: 100k rows, 1M rows, and 3.5M rows. The first
two finish successfully, while the 3.5M-row job hangs in a strange state: the
job-stages monitoring web app (the one on port 4040) stops responding, and the
command-line console gets stuck and does not even respond to Ctrl-C.
The master's web monitoring app still responds and shows the state as
FINISHED.

In s3, I see an empty directory with a single zero-sized entry
_temporary_$folder$. The s3 URL is given using the s3n:// protocol.

I did not see any errors in the logs in the web console. I also tried several
cluster sizes (1 master + 1 worker, 1 master + 5 workers) and ended up in the
same state.

Has anyone encountered such an issue? Any idea what's going on?

I also posted this question to Stack Overflow: 
http://stackoverflow.com/questions/25226419/saveastextfile-to-s3-on-spark-does-not-work-just-hangs
 



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/saveAsTextFile-to-s3-on-spark-does-not-work-just-hangs-tp7795.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

