spark-dev mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: Checkpointed RDD still causing StackOverflow
Date Tue, 24 Jun 2014 22:18:36 GMT
Avoid collect: it materializes the RDD but also transfers the data to the
driver (which might cause the driver to fail if the data is huge). You do
have to materialize the RDD in some way (save, count, and collect all do
this), so prefer save or count.
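
To illustrate why marking alone is not enough, here is a toy model in plain
Python. It is only a sketch and not Spark's API or implementation: ToyRDD,
build, and lineage_depth are hypothetical names made up for this example, and
the model assumes (as described in this thread) that checkpoint() just sets a
flag and that the lineage is cut the first time the marked dataset is
materialized.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: rows plus a chain of parents."""

    def __init__(self, parent=None, fn=None, data=None):
        self.parent = parent          # lineage pointer to the parent dataset
        self.fn = fn                  # transformation applied to parent rows
        self.data = data              # concrete rows (source or checkpoint)
        self.marked = False           # set by checkpoint(); nothing saved yet
        self.is_checkpointed = False  # True only after materialization

    def map(self, fn):
        # Lazy: just records one more step of lineage.
        return ToyRDD(parent=self, fn=fn)

    def checkpoint(self):
        # Only marks the dataset; the lineage is left untouched.
        self.marked = True

    def _compute(self):
        # Recursive walk up the lineage, one level per ancestor.
        if self.data is not None:
            return self.data
        return [self.fn(x) for x in self.parent._compute()]

    def count(self):
        rows = self._compute()
        if self.marked and not self.is_checkpointed:
            # First materialization of a marked dataset: keep the rows
            # and cut the lineage.
            self.data, self.parent, self.fn = rows, None, None
            self.is_checkpointed = True
        return len(rows)

    def lineage_depth(self):
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth


def build(steps, every, materialize):
    """Apply `steps` maps, marking a checkpoint every `every` steps."""
    rdd = ToyRDD(data=[1, 2, 3])
    for i in range(1, steps + 1):
        rdd = rdd.map(lambda x: x + 1)
        if i % every == 0:
            rdd.checkpoint()
            if materialize:
                rdd.count()  # materialize without shipping rows anywhere
    return rdd


marked_only = build(230, 50, materialize=False)
materialized = build(230, 50, materialize=True)
print(marked_only.lineage_depth())   # 230: the lineage was never cut
print(materialized.lineage_depth())  # 30: cut at step 200, then 30 more maps
```

In this model the chain that is only marked keeps growing with every
iteration, while the chain that is marked and then counted stays short. In a
real job the equivalent would be to call checkpoint() and then an action such
as count on the RDD inside the loop.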

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Tue, Jun 24, 2014 at 2:50 AM, Xiangrui Meng <mengxr@gmail.com> wrote:

> Calling checkpoint() alone doesn't cut the lineage. It only marks the
> RDD as to be checkpointed. The lineage is cut after the first time
> this RDD is materialized. You see StackOverflow because the lineage is
> still there. -Xiangrui
>
> On Sun, Jun 22, 2014 at 6:37 PM, dash <bshi@nd.edu> wrote:
> > Hi Xiangrui,
> >
> > As far as I know, calling count materializes the RDD. Does collect do the
> > same thing, since it is also an action? I cannot call count because, for
> > a Graph object, count does not materialize the RDD. I have already filed
> > an issue on that.
> >
> > My question is: why is there still a stack overflow even when
> > `isCheckpointed` is true?
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-spark-developers-list.1001551.n3.nabble.com/Checkpointed-RDD-still-causing-StackOverflow-tp7066p7068.html
> > Sent from the Apache Spark Developers List mailing list archive at
> > Nabble.com.
>
