spark-user mailing list archives

From Dylan Guedes <djmggue...@gmail.com>
Subject Re: toDebugString - RDD Logical Plan
Date Sat, 20 Apr 2019 18:41:33 GMT
Kanchan,
`toDebugString` looks unformatted because in PySpark it returns a `bytes`
object rather than a `str`, so printing it directly shows the `b'...'`
wrapper with literal `\n` escapes instead of line breaks. I suggest decoding
it before printing the RDD lineage:
`print(rdd.toDebugString().decode("utf-8"))` (note: this only occurs in
PySpark).

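About the other question, you may use `rdd.getNumPartitions()`. The decoded
lineage also shows the partition count as the number in parentheses at the
start of a stage, e.g. `(4)`.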
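
A minimal sketch of both suggestions, in case it helps. The names
`format_lineage` and `demo` are made up for illustration and are not part of
PySpark; the RDD built in `demo` is just a toy example, and `demo` assumes a
local PySpark installation.

```python
def format_lineage(debug_string):
    """Return the RDD lineage as str, whether we were handed bytes or str.

    Hypothetical helper (not a PySpark API): PySpark's toDebugString()
    returns bytes, so it is decoded here before printing.
    """
    if isinstance(debug_string, bytes):
        return debug_string.decode("utf-8")
    return debug_string


def demo():
    # Imported inside the function so the helper above can be used
    # (and tested) without PySpark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()

    # Toy RDD for illustration: 4 partitions requested explicitly.
    rdd = sc.parallelize(range(10), 4).map(lambda x: x * 2)

    # Readable, multi-line lineage instead of b'...\n...'.
    print(format_lineage(rdd.toDebugString()))

    # Partition count of the RDD.
    print(rdd.getNumPartitions())
```

Calling `demo()` from a PySpark shell or script should print the lineage on
separate lines, followed by the number of partitions.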

On Sat, Apr 20, 2019 at 2:40 PM kanchan tewary <kanchan.tewary@gmail.com>
wrote:

> Dear All,
>
> Greetings!
>
> I am new to Apache Spark and working on RDDs using pyspark. I am trying to
> understand the logical plan provided by toDebugString function, but I find
> two issues a) the output is not formatted when I print the result
> b) I do not see number of partitions shown.
>
> Can anyone direct me to any reference documentation to understand the
> logical plan better? Or, do you suggest to use DAG from spark UI instead?
>
>
> Thanks & Best Regards,
> Kanchan
> Data Engineer, IBM
>
