spark-dev mailing list archives

From Herman van Hövell tot Westerflier <hvanhov...@databricks.com>
Subject Re: Whole-stage codegen and SparkPlan.newPredicate
Date Mon, 01 Jan 2018 19:12:06 GMT
Wrong ticket: https://issues.apache.org/jira/browse/SPARK-22935

Thanks for working on this :)

On Mon, Jan 1, 2018 at 2:22 PM, Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com>
wrote:

> I ran the program from the Stack Overflow question with Spark 2.2.1 and
> master. I could not reproduce the exception, even with whole-stage codegen
> disabled. Am I missing something?
> We would appreciate it if you could create a JIRA entry with a simple
> standalone repro.
>
> In addition to this report, I realized that this program produces
> incorrect results. I created a JIRA entry for it:
> https://issues.apache.org/jira/browse/SPARK-22934
>
> Best Regards,
> Kazuaki Ishizaki
>
>
>
> From:        Herman van Hövell tot Westerflier <hvanhovell@databricks.com>
> To:        Jacek Laskowski <jacek@japila.pl>
> Cc:        dev <dev@spark.apache.org>
> Date:        2017/12/31 21:44
> Subject:        Re: Whole-stage codegen and SparkPlan.newPredicate
> ------------------------------
>
>
>
> Hi Jacek,
>
> In this case whole-stage code generation is turned off. However, we still
> use code generation for a lot of other things: projections, predicates,
> orderings & encoders. You are currently seeing a compile-time failure while
> generating a predicate. There is currently no easy way to turn code
> generation off entirely.
>
> The error itself is not great, but it still captures the problem in a
> relatively timely fashion. We should have caught this during analysis
> though. Can you file a ticket?
>
> - Herman
>
> On Sat, Dec 30, 2017 at 9:16 AM, Jacek Laskowski <jacek@japila.pl> wrote:
> Hi,
>
> While working on an issue with whole-stage codegen as reported at
> https://stackoverflow.com/q/48026060/1305344 I found out that
> spark.sql.codegen.wholeStage=false does *not* turn whole-stage codegen
> off completely.
>
>
> It looks like SparkPlan.newPredicate [1] gets called regardless of the
> value of spark.sql.codegen.wholeStage property.
>
> $ ./bin/spark-shell --conf spark.sql.codegen.wholeStage=false
> ...
> scala> spark.sessionState.conf.wholeStageEnabled
> res7: Boolean = false
>
> That leads to an issue in the SO question with whole-stage codegen
> regardless of the value:
>
> ...
>   at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:385)
>   at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:214)
>   at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:213)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:816)
> ...
>
> Is this a bug or does it work as intended? Why?
>
> [1]
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala?utf8=%E2%9C%93#L386
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
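
Herman's point in the thread, that predicates are still code-generated when whole-stage codegen is off, can be checked locally with a minimal sketch along these lines. It is not taken from the thread: the object name `CodegenCheck`, the app name, and the sample query are made up for illustration, and it assumes a Spark 2.x build with spark-sql on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone check; names here are illustrative only.
object CodegenCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("codegen-check")
      // Same setting Jacek passed to spark-shell via --conf.
      .config("spark.sql.codegen.wholeStage", "false")
      .getOrCreate()
    import spark.implicits._

    // A plain filter, so the physical plan contains a FilterExec node.
    val df = spark.range(10).filter($"id" % 2 === 0)

    // Print the physical plan. With the setting above, the Filter operator
    // should appear without the whole-stage codegen marker.
    df.explain()

    // Running the query executes FilterExec, which (per the stack trace in
    // the thread) goes through SparkPlan.newPredicate to build the predicate.
    df.collect().foreach(println)

    spark.stop()
  }
}
```

With `spark.sql.codegen.wholeStage=false` the operators are no longer fused into a single generated stage, but the filter condition itself is still compiled through `SparkPlan.newPredicate`, which is why a malformed expression can still surface as a compile-time failure.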
