spark-dev mailing list archives

From jerryye <jerr...@gmail.com>
Subject Re: saveAsTextFile to s3 on spark does not work, just hangs
Date Mon, 25 Aug 2014 20:37:24 GMT
Hi Matei,
At least in my case, the S3 bucket is in the same region. Running count()
works, and so does saving synthetic data. What I saw was that the job
would hang for over an hour with no progress, but tasks would start
finishing immediately once I cached the data.

- jerry
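
For anyone following along, the checks discussed in this thread could be sketched roughly as below. This is only an illustration, not code from anyone's actual job: the bucket name, the paths, the `s3n://` scheme, and the object name `S3SaveDiagnostics` are all assumptions, and the master URL and S3 credentials are expected to come from spark-submit and the Hadoop configuration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object S3SaveDiagnostics {
  def main(args: Array[String]): Unit = {
    // Hypothetical locations; substitute your own bucket and input.
    val inputPath  = "s3n://my-bucket/input"
    val outputBase = "s3n://my-bucket/output"

    val sc = new SparkContext(new SparkConf().setAppName("s3-save-diagnostics"))

    // Check 1: save a synthetic dataset, as suggested in the quoted reply.
    // If this also hangs, the problem is likely in the S3 write path itself.
    sc.parallelize(1 to 100 * 1000 * 1000, 20)
      .saveAsTextFile(outputBase + "/synthetic")

    // Check 2: run a simple action on the real data.
    val data = sc.textFile(inputPath)
    println("rows: " + data.count())

    // Check 3: cache and materialize the data first, then save it.
    // Caching before the save is what unblocked the job described above.
    data.cache()
    data.count() // forces the cached partitions to be computed
    data.saveAsTextFile(outputBase + "/cached")

    sc.stop()
  }
}
```

If check 1 succeeds and check 3 succeeds where a plain save of the uncached data hangs, that would point at recomputing the input during the save rather than at the S3 write itself, though that is a guess and not something confirmed in this thread.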


On Mon, Aug 25, 2014 at 1:26 PM, Matei Zaharia [via Apache Spark Developers
List] <ml-node+s1001551n8000h39@n3.nabble.com> wrote:

> Was the original issue with Spark 1.1 (i.e. master branch) or an earlier
> release?
>
> One possibility is that your S3 bucket is in a remote Amazon region, which
> would make it very slow. In my experience though saveAsTextFile has worked
> even for pretty large datasets in that situation, so maybe there's
> something else in your job causing a problem. Have you tried other
> operations on the data, like count(), or saving synthetic datasets (e.g.
> sc.parallelize(1 to 100*1000*1000, 20).saveAsTextFile(...))?
>
> Matei
>
> On August 25, 2014 at 12:09:25 PM, amnonkhen wrote:
>
> Hi jerryye,
> Maybe if you voted up my question on Stack Overflow it would get some
> traction and we would get nearer to a solution.
> Thanks,
> Amnon