spark-issues mailing list archives

From "Zachary S Ennenga (Jira)" <>
Subject [jira] [Commented] (SPARK-28889) Allow UDTs to define custom casting behavior
Date Fri, 30 Aug 2019 18:42:00 GMT


Zachary S Ennenga commented on SPARK-28889:

While I understand that the Spark team may not be particularly interested in solving this problem
itself at this time, I'm more concerned with understanding whether this is in line with the
eventual solution for UDTs and datasets. If it is, I'm about halfway through the PR as is,
and I'm happy to complete it.

> Allow UDTs to define custom casting behavior
> --------------------------------------------
>                 Key: SPARK-28889
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Zachary S Ennenga
>            Priority: Minor
> Looking at `org.apache.spark.sql.catalyst.expressions.Cast`, UDTs do not support any
casting other than identity casts, i.e.:
> {code:java}
> case (udt1: UserDefinedType[_], udt2: UserDefinedType[_]) if udt1.userClass == udt2.userClass =>
>  true
> {code}
> I propose we extend this to let UDTs define their own canCast and cast functions, so
users can implement custom cast mechanisms.
> An example of how this might look:
> {code:java}
> case (fromType, toType: UserDefinedType[_]) =>
>  toType.canCast(fromType) // Returns a Boolean
> {code}
> {code:java}
> case (fromType, toType: UserDefinedType[_]) =>
>  toType.cast(fromType) // Returns a casting function
> {code}
> The UDT base class would contain a default implementation that replicates current behavior
(i.e., no casting beyond identity).
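
To make the proposal concrete, here is a minimal, self-contained sketch of how the proposed hooks might look. The trait and class names (DataType, UserDefinedType, PointUDT) and the canCast/cast signatures are illustrative assumptions mirroring the description above, not Spark's actual classes or a committed API.

{code:java}
// Hypothetical sketch, not Spark source.
trait DataType

case object StringType extends DataType

abstract class UserDefinedType[T] extends DataType {
  // Default matches current Spark behavior: only identity casts are allowed.
  def canCast(from: DataType): Boolean = from == this
  // Default cast is the identity function (no conversion).
  def cast(from: DataType): Any => Any = identity
}

// Example UDT that opts in to casting from StringType.
case class PointUDT() extends UserDefinedType[(Double, Double)] {
  override def canCast(from: DataType): Boolean =
    from == this || from == StringType
  override def cast(from: DataType): Any => Any = from match {
    case StringType => (v: Any) => {
      val Array(x, y) = v.asInstanceOf[String].split(",")
      (x.trim.toDouble, y.trim.toDouble)
    }
    case _ => identity
  }
}
{code}

A UDT that does not override either method would keep today's identity-only behavior, so the change would be backward compatible.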

This message was sent by Atlassian Jira
