spark-dev mailing list archives

From Jacek Laskowski
Subject [SS] Why does ConsoleSink's addBatch convert input DataFrame to show it?
Date Fri, 07 Jul 2017 08:43:16 GMT

I just noticed that ConsoleSink's addBatch collects the input DataFrame
and then parallelizes it again simply to show it on the console [1]. Why
so many fairly expensive operations just for show?
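
To make the question concrete, here is a minimal sketch of the pattern as I read it. It is not the actual ConsoleSink source: `CollectThenShow` and `rebuildOnDriver` are hypothetical names, and the sample data and local master are assumptions made only so the snippet runs on its own.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CollectThenShow {
  // Hypothetical helper: pull every row to the driver, then redistribute
  // the rows and rebuild a DataFrame with the original schema.
  def rebuildOnDriver(data: DataFrame): DataFrame = {
    val spark = data.sparkSession
    val rows = data.collect() // Array[Row], fully materialized on the driver
    spark.createDataFrame(spark.sparkContext.parallelize(rows), data.schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run, for illustration only
      .appName("collect-then-show")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for one batch
    val batch = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")

    // The round trip: collect, parallelize, then show
    rebuildOnDriver(batch).show(20, truncate = false)

    // What one might expect instead: show the input directly
    batch.show(20, truncate = false)

    spark.stop()
  }
}
```

On a plain batch DataFrame like the one above, both calls print the same rows, which is why the extra collect and parallelize round trip looks unnecessary to me.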

I'd appreciate some help understanding this code. Thanks.


Jacek Laskowski
Mastering Apache Spark 2