spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: toDebugString is clipped
Date Sun, 13 Nov 2016 12:50:27 GMT
I believe it's the Scala shell (the REPL), not Spark, that's cropping the
output. See
http://blog.ssanj.net/posts/2016-10-16-output-in-scala-repl-is-truncated.html
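
A minimal sketch of one way around the clipping, assuming the truncation affects only the REPL's echo of a result (the `res11: String = ...` display) and not text written to stdout with println. The names `ReplOutput` and `numbered` are hypothetical helpers for illustration, not part of Spark or Scala.

```scala
object ReplOutput {
  // Split a multi-line string (such as the result of toDebugString)
  // into lines and prefix each one with its 1-based line number.
  def numbered(text: String): List[String] =
    text.split("\n").toList.zipWithIndex.map {
      case (line, i) => s"${i + 1}: $line"
    }
}

// Assumed usage in spark-shell, with `graph` defined as in the quoted message:
//   println(graph.vertices.toDebugString)
//   ReplOutput.numbered(graph.vertices.toDebugString).foreach(println)
```

The plain println call should be enough on its own. The numbered variant only makes it easier to check that no line is missing.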

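On question (b) in the quoted message, here is a rough sketch of how the full toDebugString text could be reduced to an outline before drawing a picture. It assumes the format visible in the quoted output: a line that starts a new group carries a "(N)" partition count, and a group reached through a shuffle dependency is marked with "+-" in front of that count. `LineageOutline`, `StageHead` and `stageHeads` are hypothetical names, not Spark APIs.

```scala
object LineageOutline {
  // One entry per toDebugString line that begins with a "(N)" partition count.
  final case class StageHead(indent: Int, partitions: Int, afterShuffle: Boolean, rdd: String)

  // Optional tree prefix made of spaces and '|', an optional "+-" marker,
  // the "(N)" partition count, then the RDD description.
  private val Head = """([ |]*)(\+-)?\((\d+)\)\s+(.*)""".r

  def stageHeads(debug: String): List[StageHead] =
    debug.split("\n").toList.collect {
      case Head(prefix, marker, n, rdd) =>
        // marker is null when the optional "+-" group did not match
        StageHead(prefix.length, n.toInt, marker != null, rdd)
    }
}

// Assumed usage in spark-shell, with `graph` defined as in the quoted message:
//   LineageOutline.stageHeads(graph.vertices.toDebugString).foreach(println)
```

Each entry with `afterShuffle = true` sits on the far side of a shuffle from the group above it, and `indent` shows how deeply it is nested, which is roughly the information needed to sketch the boxes and arrows by hand.
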
On Sun, Nov 13, 2016 at 1:56 AM Anirudh Perugu <
anirudh.perugu@stonybrook.edu> wrote:

> Hello all,
>
> I am trying to understand how GraphX works internally.
>
> I created a small program in GraphX:
> 1. I create a new graph:
> val graph: Graph[(String, Double), Int] = Graph(vertexRDD, edgeRDD)
> 2. Now I want to see how my vertices were created, so I use:
> scala> graph.vertices.toDebugString
> res11: String =
> (48) VertexRDDImpl[11] at RDD at VertexRDD.scala:57 []
>  |   VertexRDD, VertexRDD ZippedPartitionsRDD2[9] at zipPartitions at VertexRDD.scala:322 []
>  |       CachedPartitions: 48; MemorySize: 328.0 KB; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
>  |   ShuffledRDD[5] at partitionBy at VertexRDD.scala:319 []
>  +-(48) ParallelCollectionRDD[0] at parallelize at <console>:45 []
>  |   MapPartitionsRDD[8] at mapPartitions at VertexRDD.scala:361 []
>  |   ShuffledRDD[7] at partitionBy at VertexRDD.scala:361 []
>  +-(48) VertexRDD.createRoutingTables - vid2pid (aggregation) MapPartitionsRDD[6] at mapPartitions at VertexRDD.scala:356 []
>     |   EdgeRDD, EdgeRDD MapPartitionsRDD[2] at mapPartitionsWithIndex at EdgeRDD.scala:105 []
>     |   ParallelCollectionRDD[1] at parallelize at <cons...
> scala>
> But this doesn't give me the whole picture; as you can see, it is clipped
> (I guess 10 lines is the default).
> (a) Is there an option to increase this number so that I can see the whole
> output?
> (b) I know that indentation indicates a shuffle boundary and the parentheses
> indicate the parallelism at each step of this physical plan. Does this mean
> the above can be put into a picture like:
> RDD A (VertexRDD.cre..) [48 partitions]
>                                           \
>                                            --- RDD C (VertexRDD, VertexRDD Zipped...) [48 partitions]
>                                           /
> RDD B (ParallelCollecti..) [48 partitions]
>
> I am fairly new to Spark, so please feel free to correct me!
>
> Thanks
> Anirudh
>
