The `toDebugString` output looks unformatted because in PySpark it returns a bytes object, so printing it directly shows escaped newlines instead of real line breaks. I suggest printing the RDD lineage with
`print(rdd.toDebugString().decode("utf-8"))` instead (note: this only happens in PySpark).

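As for your other question, you can use `rdd.getNumPartitions()`. The number in parentheses at the start of the debug string, e.g. `(4)`, is also the partition count.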
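
In case it helps, here is a minimal sketch that puts both suggestions together. The helper names `lineage_text` and `describe_rdd` are my own, not part of the Spark API; the only Spark calls used are `toDebugString()` and `getNumPartitions()`. I am assuming `toDebugString()` may hand back bytes, a str, or nothing at all, so the helper handles all three.

```python
def lineage_text(debug):
    """Return the lineage as a str, whatever type toDebugString() produced."""
    if debug is None:
        # Defensive assumption: treat a missing debug string as empty.
        return ""
    if isinstance(debug, bytes):
        # PySpark returns bytes, so decode to get real line breaks when printed.
        return debug.decode("utf-8")
    return debug


def describe_rdd(rdd):
    """Return (number of partitions, readable lineage) for the given RDD."""
    return rdd.getNumPartitions(), lineage_text(rdd.toDebugString())


# Usage, assuming an existing SparkContext named `sc`:
#   rdd = sc.parallelize(range(10), 4).map(lambda x: x * 2)
#   num_partitions, lineage = describe_rdd(rdd)
#   print(num_partitions)
#   print(lineage)
```

The helpers do not import pyspark themselves, so they work with any object that exposes those two methods.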

On Sat, Apr 20, 2019 at 2:40 PM kanchan tewary <kanchan.tewary@gmail.com> wrote:
Dear All,


I am new to Apache Spark and am working on RDDs using PySpark. I am trying to understand the logical plan provided by the toDebugString function, but I find two issues:
a) the output is not formatted when I print the result
b) I do not see the number of partitions shown.

Can anyone direct me to any reference documentation to understand the logical plan better? Or do you suggest using the DAG from the Spark UI instead?

