spark-dev mailing list archives

From Herman van Hövell tot Westerflier <hvanhov...@databricks.com>
Subject Re: Whole-stage codegen and SparkPlan.newPredicate
Date Sun, 31 Dec 2017 12:44:01 GMT
Hi Jacek,

In this case whole-stage code generation is turned off. However, we still
use code generation for a lot of other things: projections, predicates,
orderings and encoders. You are currently seeing a compile-time failure while
generating a predicate. There is currently no easy way to turn code
generation off entirely.
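
A minimal sketch of how the distinction could be observed in spark-shell,
assuming a Spark 2.x build like the one discussed in this thread. The query
(a filter over spark.range) is a made-up example, not the one from the
Stack Overflow question, and `spark` is the session that spark-shell provides.

```scala
// Sketch only: assumes spark-shell on Spark 2.x, where `spark` is predefined.
import org.apache.spark.sql.execution.WholeStageCodegenExec
import org.apache.spark.sql.functions.col

// Turn whole-stage codegen off for queries planned from here on.
spark.conf.set("spark.sql.codegen.wholeStage", "false")

// Hypothetical query with a single filter.
val df = spark.range(10).filter(col("id") > 5)

// Collect any whole-stage codegen nodes in the physical plan;
// with the setting above, none are expected.
val wholeStageNodes = df.queryExecution.executedPlan.collect {
  case w: WholeStageCodegenExec => w
}
println(s"whole-stage codegen nodes: ${wholeStageNodes.size}")

// The filter still runs. In the 2.x code path shown in the stack trace,
// FilterExec builds its predicate through SparkPlan.newPredicate, which
// generates and compiles code regardless of the wholeStage setting.
println(df.count())
```

The plan inspection only shows that no whole-stage stage was inserted; the
predicate for the filter is still compiled at execution time, which is where
the failure in the Stack Overflow question surfaces.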

The error itself is not great, but it still captures the problem in a
relatively timely fashion. We should have caught this during analysis
though. Can you file a ticket?

- Herman

On Sat, Dec 30, 2017 at 9:16 AM, Jacek Laskowski <jacek@japila.pl> wrote:

> Hi,
>
> While working on an issue with whole-stage codegen, as reported at
> https://stackoverflow.com/q/48026060/1305344 I found out
> that spark.sql.codegen.wholeStage=false does *not* turn whole-stage
> codegen off completely.
>
> It looks like SparkPlan.newPredicate [1] gets called regardless of the
> value of spark.sql.codegen.wholeStage property.
>
> $ ./bin/spark-shell --conf spark.sql.codegen.wholeStage=false
> ...
> scala> spark.sessionState.conf.wholeStageEnabled
> res7: Boolean = false
>
> That leads to an issue in the SO question with whole-stage codegen
> regardless of the value:
>
> ...
>   at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:385)
>   at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:214)
>   at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:213)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:816)
> ...
>
> Is this a bug or does it work as intended? Why?
>
> [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala?utf8=%E2%9C%93#L386
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
