spark-user mailing list archives

From Daniel Darabos <daniel.dara...@lynxanalytics.com>
Subject Re: Quick one on evaluation
Date Thu, 03 Aug 2017 09:53:50 GMT
On Wed, Aug 2, 2017 at 2:16 PM, Jean Georges Perrin <jgp@jgp.net> wrote:

> Hi Sparkians,
>
> I understand the lazy evaluation mechanism with transformations and
> actions. My question is simpler: 1) are show() and/or printSchema()
> actions? I would assume so...
>

show() is an action: it has to compute rows in order to print them.
printSchema() is not an action, because Spark can tell you the schema of
the result without computing the result.
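
One way to see the difference for yourself is a rough sketch like the one
below. It assumes a local SparkSession and uses an accumulator (the names
SchemaVsShow, run and rowsTouched are made up for illustration) to count
how many rows a map function has actually processed. The count should
still be zero after printSchema() and only become positive after show().

```scala
import org.apache.spark.sql.SparkSession

object SchemaVsShow {
  // Returns the accumulator value after printSchema() and after show().
  def run(spark: SparkSession): (Long, Long) = {
    import spark.implicits._

    // Counts how many rows the map function has actually processed.
    val touched = spark.sparkContext.longAccumulator("rowsTouched")
    val ds = spark.range(3).map { x => touched.add(1); x.longValue * 2 }

    ds.printSchema()                           // schema only, no job is run
    val afterSchema = touched.value.longValue  // should still be 0

    ds.show()                                  // action: the map runs now
    val afterShow = touched.value.longValue    // should now be positive

    (afterSchema, afterShow)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("schema-vs-show")
      .getOrCreate()
    val (afterSchema, afterShow) = run(spark)
    println(s"after printSchema: $afterSchema, after show: $afterShow")
    spark.stop()
  }
}
```

Accumulators updated inside transformations can be counted more than once
if a task is retried, so treat the second number as "greater than zero"
rather than as an exact row count.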

> and optional question: 2) is there a way to know if there are
> transformations "pending"?
>

There are always transformations pending :). An RDD or DataFrame is a
series of pending transformations. If you say val df =
spark.read.csv("foo.csv"), that is a pending transformation. Even
spark.emptyDataFrame is best understood as a pending transformation: it
does not do anything on the cluster, but records locally what it will have
to do on the cluster.
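
If you want to look at what has been recorded, you can ask Spark for the
plan. The sketch below assumes a local SparkSession; PendingPlan and build
are hypothetical names. explain() prints the plan without running it, and
queryExecution (a developer-facing API that may change between versions)
exposes the recorded logical plan directly.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object PendingPlan {
  // Builds a DataFrame from a couple of transformations; nothing runs yet.
  def build(spark: SparkSession): DataFrame =
    spark.range(10).toDF("id").filter("id % 2 = 0")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pending-plan")
      .getOrCreate()

    val df = build(spark)

    // The transformations recorded so far, as a logical plan.
    println(df.queryExecution.logical)

    // Parsed, analyzed, optimized and physical plans; still no job is run.
    df.explain(true)

    // Even the empty DataFrame has a (trivial) plan.
    spark.emptyDataFrame.explain()

    // Only an action such as count() makes the cluster do the work.
    println(df.count())

    spark.stop()
  }
}
```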
