spark-issues mailing list archives

From "Dongjoon Hyun (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-27692) Optimize evaluation of udf that is deterministic and has literal inputs
Date Mon, 16 Mar 2020 22:52:07 GMT

     [ https://issues.apache.org/jira/browse/SPARK-27692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-27692:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> Optimize evaluation of udf that is deterministic and has literal inputs
> -----------------------------------------------------------------------
>
>                 Key: SPARK-27692
>                 URL: https://issues.apache.org/jira/browse/SPARK-27692
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Sunitha Kambhampati
>            Priority: Major
>
> A deterministic UDF is a UDF for which the following is true: given a specific input,
the output of the UDF is the same no matter how many times you execute it.
> When all inputs to a UDF are literals and the UDF is deterministic, we can optimize
the query to evaluate the UDF once and reuse the output, instead of evaluating the UDF
again for every row in the query.
> This is valid only if the UDF is deterministic and all of its inputs are literals.
Otherwise, we must not apply this optimization.
> *Testing:* 
> We have used this internally and have seen significant performance improvements for some
very expensive UDFs (as expected).
> In the PR, I have added unit tests. 
> *Credits:* 
> Thanks to Guy Khazma ([https://github.com/guykhazma]) from the IBM Haifa Research Team
for the idea and the original implementation.
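
The rule in the description above can be sketched outside Spark. The following is a minimal illustration in plain Python, not the code from the PR and not Spark's Catalyst API: the names Literal, Column, UdfCall, fold_literal_udf and evaluate are hypothetical and exist only for this example. It assumes a tiny expression tree in which a UDF call carries a "deterministic" flag, and it folds a call into a literal only when the flag is set and every argument is a literal.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A constant value known at planning time (hypothetical type)."""
    value: Any


@dataclass(frozen=True)
class Column:
    """A reference to a per-row value (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class UdfCall:
    """A call to a user-defined function (hypothetical type)."""
    func: Callable[..., Any]
    args: Tuple["Expr", ...]
    deterministic: bool = True


Expr = Union[Literal, Column, UdfCall]


def fold_literal_udf(expr: Expr) -> Expr:
    """Replace a deterministic, literal-only UDF call with its result."""
    if not isinstance(expr, UdfCall):
        return expr
    # Fold the arguments first so nested literal-only calls collapse bottom-up.
    args = tuple(fold_literal_udf(a) for a in expr.args)
    if expr.deterministic and all(isinstance(a, Literal) for a in args):
        # Both conditions hold: evaluate once, at planning time.
        return Literal(expr.func(*(a.value for a in args)))
    # Otherwise keep the call so it is still evaluated for every row.
    return UdfCall(expr.func, args, expr.deterministic)


def evaluate(expr: Expr, row: Dict[str, Any]) -> Any:
    """Evaluate an expression against a single row."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, Column):
        return row[expr.name]
    return expr.func(*(evaluate(a, row) for a in expr.args))


# Demo: count how often a stand-in "expensive" UDF actually runs.
call_log = []


def expensive(x, y):
    call_log.append((x, y))
    return x * y


rows = [{"id": i} for i in range(1000)]

# Literal-only and deterministic: folded, so the UDF runs once for all rows.
folded = fold_literal_udf(UdfCall(expensive, (Literal(6), Literal(7))))
folded_results = [evaluate(folded, r) for r in rows]
folded_calls = len(call_log)

# Same call marked non-deterministic: left alone, so it runs once per row.
call_log.clear()
unfolded = fold_literal_udf(
    UdfCall(expensive, (Literal(6), Literal(7)), deterministic=False)
)
unfolded_results = [evaluate(unfolded, r) for r in rows]
unfolded_calls = len(call_log)
```

In this sketch the folded plan invokes the UDF once regardless of the number of rows, while the non-deterministic variant is still invoked per row. A call with any Column argument is likewise left unfolded, since its inputs are not all literals.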



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

