spark-user mailing list archives

From Liquan Pei <liquan...@gmail.com>
Subject Re: Is there a way to look at RDD's lineage? Or debug a fault-tolerance error?
Date Wed, 08 Oct 2014 19:14:33 GMT
RDD has a toDebugString method that returns a description of the RDD and
its recursive dependencies (its lineage) as a string, which you can print
for debugging.
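
As a rough sketch of how that could look for a pipeline shaped like the one in the question below: the input data, the map functions, the partition count, and the names LineageDebug, buildPipeline, and lineageDepth are all placeholders I made up for illustration, not anything from the original job. The lineageDepth helper is just one possible way to watch the lineage grow per iteration, using the public dependencies field on RDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LineageDebug {
  // Builds an iterative pipeline with the same shape as the one in the
  // question. The data and the map functions are placeholders.
  def buildPipeline(sc: SparkContext, iterations: Int): RDD[Int] = {
    val rawRDD = sc.parallelize(1 to 100)
    val repartRDD = rawRDD.repartition(4)

    val tx1 = repartRDD.map(_ * 2)
    var tx2 = tx1.map(_ + 1)

    for (_ <- 1 to iterations) {
      tx2 = tx1.zip(tx2).map { case (a, b) => a + b }
    }
    tx2
  }

  // Hypothetical helper: length of the longest dependency chain behind an RDD.
  def lineageDepth(rdd: RDD[_]): Int =
    1 + rdd.dependencies.map(d => lineageDepth(d.rdd)).foldLeft(0)(_ max _)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("lineage-debug").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val result = buildPipeline(sc, iterations = 3)
      // toDebugString only inspects the RDD graph; it does not run a job.
      println(result.toDebugString)
      println(s"lineage depth: ${lineageDepth(result)}")
    } finally {
      sc.stop()
    }
  }
}
```

Printing this once per iteration (or every few iterations) should show whether the lineage is growing the way you expect.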

Thanks,
Liquan

On Wed, Oct 8, 2014 at 12:01 PM, Sung Hwan Chung <codedeft@cs.stanford.edu>
wrote:

> My job is not being fault-tolerant (e.g., when there's a fetch failure or
> something).
>
> The lineage of the RDDs is updated on every iteration. However, I
> think that when there's a failure, the lineage information is not being
> correctly reapplied.
>
> It goes something like this:
>
> val rawRDD = read(...)
> val repartRDD = rawRDD.repartition(X)
>
> val tx1 = repartRDD.map(...)
> var tx2 = tx1.map(...)
>
> while (...) {
>   tx2 = tx1.zip(tx2).map(...)
> }
>
>
> Is there any way to monitor an RDD's lineage, maybe even including? I want to
> make sure that nothing unexpected is happening.
>



-- 
Liquan Pei
Department of Physics
University of Massachusetts Amherst
