spark-issues mailing list archives

From "Wenchen Fan (Jira)" <>
Subject [jira] [Created] (SPARK-30127) UDF should work for case class like Dataset operations
Date Wed, 04 Dec 2019 15:33:00 GMT
Wenchen Fan created SPARK-30127:

             Summary: UDF should work for case class like Dataset operations
                 Key: SPARK-30127
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Wenchen Fan

Currently, a Spark UDF can only operate on data types such as java.lang.String, o.a.s.sql.Row,
Seq[_], etc. This is inconvenient when you want to apply an operation to a single column of
struct type: you must read the data from a Row object rather than from your domain object,
as you can with Dataset operations. It would be great if UDFs could work on the types that
Dataset supports, e.g. case classes.
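
To illustrate the inconvenience described above, here is a minimal sketch of a UDF applied to
a struct column through a Row. The column names, the sample data, and the names RowUdfSketch
and describeRow are assumptions made for illustration; they do not come from the issue.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

object RowUdfSketch {
  // The function receives a generic Row, so fields are read by position
  // and the caller must know the struct layout (name: String, age: Int).
  val describeRow: Row => String =
    (r: Row) => s"${r.getString(0)} is ${r.getInt(1)}"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("row-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Build a single struct-typed column named "person" (hypothetical data).
    val people = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")
      .select(struct(col("name"), col("age")).as("person"))

    // Wrap the Row-based function as a UDF and apply it to the struct column.
    val describe = udf(describeRow)
    people.select(describe(col("person")).as("description")).show(false)

    spark.stop()
  }
}
```

Nothing in the function's signature ties it to the struct's fields, so a change to the
struct layout only surfaces as a runtime error inside the UDF.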

Note that there are multiple ways to register a UDF, and this feature can only be supported
when the UDF is registered through a Scala API that provides type tags, e.g.
`def udf[RT: TypeTag, A1: TypeTag](f: Function1[A1, RT])`.
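
As a sketch of what the proposal would allow, the following passes a function over a case
class to the type-tagged `udf` API quoted above. It assumes a Spark build in which this
feature is implemented; on a build without it, the UDF would still receive a Row and the
call would be expected to fail at runtime. The Person case class, the sample data, and the
names CaseClassUdfSketch and describePerson are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct, udf}

// Hypothetical domain type whose fields match the struct column.
// Declared at top level so that a TypeTag can be resolved for it.
case class Person(name: String, age: Int)

object CaseClassUdfSketch {
  // The function works on the domain object instead of a generic Row.
  val describePerson: Person => String =
    (p: Person) => s"${p.name} is ${p.age}"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-class-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    val people = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")
      .select(struct(col("name"), col("age")).as("person"))

    // udf[RT: TypeTag, A1: TypeTag] sees A1 = Person through its type tag,
    // which is the information needed to convert the struct to a Person.
    // ASSUMPTION: requires a Spark build that implements this proposal.
    val describe = udf(describePerson)
    people.select(describe(col("person")).as("description")).show(false)

    spark.stop()
  }
}
```

Registration paths that do not carry a type tag for the input type give Spark no way to
learn that the argument should be a Person, which is why the issue limits the feature to
the type-tagged Scala API.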
