spark-issues mailing list archives

From "Hyukjin Kwon (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-31660) Dataset.joinWith supports JoinType object as input parameter
Date Sun, 10 May 2020 06:10:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-31660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31660.
----------------------------------
    Resolution: Duplicate

  I think SPARK-26739 includes this.

> Dataset.joinWith supports JoinType object as input parameter
> ------------------------------------------------------------
>
>                 Key: SPARK-31660
>                 URL: https://issues.apache.org/jira/browse/SPARK-31660
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5
>            Reporter: Rex Xiong
>            Priority: Minor
>
> The current Dataset.joinWith API accepts joinType only as a String; it does not support a JoinType object.
> I would prefer a JoinType object (like an enum) over a String: it leaves less room for typos and is more readable.
> {code:scala}
> def joinWith[U](other: Dataset[U], condition: Column, joinType: String): Dataset[(T, U)] = {
> {code}
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
> If I pass LeftOuter.sql as joinType, it throws an exception, because LeftOuter.sql contains a space:
> {code:scala}
> case object LeftOuter extends JoinType {
>   override def sql: String = "LEFT OUTER"
> }
> {code}
> Meanwhile, JoinType.apply only removes underscores and does not handle whitespace:
> {code:scala}
> object JoinType {
>   def apply(typ: String): JoinType = typ.toLowerCase(Locale.ROOT).replace("_", "") match {
>     case "inner" => Inner
>     case "outer" | "full" | "fullouter" => FullOuter
>     case "leftouter" | "left" => LeftOuter
>     case "rightouter" | "right" => RightOuter
>     case "leftsemi" | "semi" => LeftSemi
>     case "leftanti" | "anti" => LeftAnti
>     case "cross" => Cross
>     case _ =>
>       val supported = Seq(
>         "inner",
>         "outer", "full", "fullouter", "full_outer",
>         "leftouter", "left", "left_outer",
>         "rightouter", "right", "right_outer",
>         "leftsemi", "left_semi", "semi",
>         "leftanti", "left_anti", "anti",
>         "cross")
>       throw new IllegalArgumentException(s"Unsupported join type '$typ'. " +
>         "Supported join types include: " + supported.mkString("'", "', '", "'") + ".")
>   }
> }{code}
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala
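The normalization step in the quoted JoinType.apply can be reproduced in isolation to see why "LEFT OUTER" falls through to the error branch. This is a minimal sketch that uses only the standard library. The names JoinTypeNormalization, normalize, and normalizeLenient are hypothetical and are not part of Spark; normalizeLenient only illustrates what a whitespace-tolerant variant might look like.

```scala
import java.util.Locale

// Hypothetical helper object for illustration; not part of Spark.
object JoinTypeNormalization {
  // Same normalization as the quoted JoinType.apply: lowercase, then drop underscores.
  def normalize(typ: String): String =
    typ.toLowerCase(Locale.ROOT).replace("_", "")

  // Assumed lenient variant: additionally drops every whitespace character.
  def normalizeLenient(typ: String): String =
    normalize(typ).replaceAll("\\s+", "")

  def main(args: Array[String]): Unit = {
    println(normalize("left_outer"))        // leftouter  (matches a case in apply)
    println(normalize("LEFT OUTER"))        // left outer (matches no case, so apply throws)
    println(normalizeLenient("LEFT OUTER")) // leftouter
  }
}
```

With the current behavior, a caller could likely avoid the exception by passing LeftOuter.sql.replace(" ", "_"), since the underscore is then removed by the existing normalization.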
> I suggest we either add another set of APIs that take a JoinType instead of a String, or change JoinType.apply to strip whitespace as well.
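
The first option (a typed parameter instead of a String) could be approximated on the caller side without changing Spark. The sketch below is self-contained and does not use Spark classes: JoinKind is a hypothetical stand-in for Catalyst's JoinType, apiName is an assumed field holding one of the strings listed as supported in the quoted JoinType.apply, and TypedJoin.withKind is a hypothetical adapter around any String-based method. Only a subset of join types is modelled.

```scala
// Hypothetical stand-in for Catalyst's JoinType hierarchy; not the Spark classes.
sealed trait JoinKind {
  // String form taken from the "supported" list in the quoted JoinType.apply.
  def apiName: String
}

object JoinKind {
  case object Inner      extends JoinKind { val apiName = "inner" }
  case object FullOuter  extends JoinKind { val apiName = "full_outer" }
  case object LeftOuter  extends JoinKind { val apiName = "left_outer" }
  case object RightOuter extends JoinKind { val apiName = "right_outer" }
  case object Cross      extends JoinKind { val apiName = "cross" }
}

// Hypothetical adapter: turns a typed join kind into the String that a
// String-based API (such as the quoted joinWith) expects.
object TypedJoin {
  def withKind[A](kind: JoinKind)(stringApi: String => A): A =
    stringApi(kind.apiName)
}

// Assumed usage, with `left`, `right`, and `cond` already in scope:
//   TypedJoin.withKind(JoinKind.LeftOuter)(jt => left.joinWith(right, cond, jt))
```

Because JoinKind is sealed, a misspelled join type becomes a compile error rather than a runtime IllegalArgumentException, which is the benefit the reporter is asking for.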



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

