spark-user mailing list archives

From: Dylan Guedes
Subject: Re: toDebugString - RDD Logical Plan
Date: Sat, 20 Apr 2019 18:41:33 GMT

The `toDebugString` output looks unformatted because in PySpark it comes back
as a byte string, so you need to decode it before printing (I can't remember
why it is returned that way, though). I suggest printing the RDD lineage with
`print(rdd.toDebugString().decode("utf-8"))` instead (note: this only happens
in PySpark).

As for the other question, you can use `rdd.getNumPartitions()`.
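
To make both suggestions concrete, here is a minimal sketch, assuming a local PySpark installation. The helper names `lineage_text` and `demo`, and the sample RDD, are made up for illustration; `toDebugString()` and `getNumPartitions()` are the actual RDD methods. As far as I know, the number in parentheses at the start of the lineage output is also the partition count.

```python
def lineage_text(raw):
    """Return toDebugString() output as str.

    PySpark hands the lineage back as bytes, so decode it; a plain str
    is passed through unchanged so the helper is safe either way.
    """
    if isinstance(raw, bytes):
        return raw.decode("utf-8")
    return raw


def demo():
    """Hypothetical demo: needs PySpark and a working local Spark setup."""
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    # Sample RDD for illustration only: 10 numbers spread over 4 partitions.
    rdd = sc.parallelize(range(10), 4).map(lambda x: x * 2)

    # Decoded lineage prints one RDD per line instead of a b'...' literal.
    print(lineage_text(rdd.toDebugString()))

    # Partition count of the RDD.
    print(rdd.getNumPartitions())

    sc.stop()
```

Calling `demo()` from a PySpark shell or script should print the lineage with real line breaks, followed by the partition count of the sample RDD.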

On Sat, Apr 20, 2019 at 2:40 PM kanchan tewary wrote:

> Dear All,
> Greetings!
> I am new to Apache Spark and working on RDDs using pyspark. I am trying to
> understand the logical plan provided by toDebugString function, but I find
> two issues a) the output is not formatted when I print the result
> b) I do not see number of partitions shown.
> Can anyone direct me to any reference documentation to understand the
> logical plan better? Or, do you suggest to use DAG from spark UI instead?
> Thanks & Best Regards,
> Kanchan
> Data Engineer, IBM
