spark-dev mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: saveAsTextFile to s3 on spark does not work, just hangs
Date Mon, 25 Aug 2014 20:34:04 GMT
One other idea - when things freeze up, try running jstack on the Spark
shell process and on the executors, and attach the results. You may be
hitting a deadlock somewhere.
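
A minimal sketch of how those dumps could be collected, assuming a JDK
(which ships jstack and jps) is on the PATH of each machine. The function
name dump_threads, the JSTACK_BIN override, and the output file naming are
illustrative choices, not part of Spark or the JDK. It takes several dumps a
few seconds apart, since a thread parked in the same frame in every dump is
easier to spot than in a single snapshot.

```shell
#!/bin/sh
# Sketch: capture repeated thread dumps from one JVM while a job appears hung.
# JSTACK_BIN is an assumed override for when jstack is not on the PATH.
JSTACK_BIN="${JSTACK_BIN:-jstack}"

# dump_threads PID OUT_DIR [COUNT] [INTERVAL_SECONDS]
# Writes COUNT dumps of process PID into OUT_DIR, INTERVAL_SECONDS apart.
dump_threads() {
    pid="$1"
    out_dir="$2"
    count="${3:-3}"
    interval="${4:-5}"
    mkdir -p "$out_dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # One file per dump so successive snapshots can be compared.
        "$JSTACK_BIN" "$pid" > "$out_dir/jstack-$pid-$i.txt" 2>&1
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

To use it, list the JVMs on the machine with `jps -lm`, pick the pid of the
Spark shell (on the driver) or of an executor (on a worker), and run
something like `dump_threads <pid> ./dumps`. jstack generally has to run as
the same user that owns the target JVM. When a JVM-level deadlock exists,
jstack reports it at the end of the dump.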


On Mon, Aug 25, 2014 at 1:26 PM, Matei Zaharia <matei.zaharia@gmail.com>
wrote:

> Was the original issue with Spark 1.1 (i.e. master branch) or an earlier
> release?
>
> One possibility is that your S3 bucket is in a remote Amazon region, which
> would make it very slow. In my experience though saveAsTextFile has worked
> even for pretty large datasets in that situation, so maybe there's
> something else in your job causing a problem. Have you tried other
> operations on the data, like count(), or saving synthetic datasets (e.g.
> sc.parallelize(1 to 100*1000*1000, 20).saveAsTextFile(...))?
>
> Matei
>
> On August 25, 2014 at 12:09:25 PM, amnonkhen (amnon.is@gmail.com) wrote:
>
> Hi jerryye,
> Maybe if you voted up my question on Stack Overflow it would get some
> traction and we would get nearer to a solution.
> Thanks,
> Amnon
