spark-issues mailing list archives

From "Rex Xiong (Jira)" <j...@apache.org>
Subject [jira] [Created] (SPARK-31660) Dataset.joinWith supports JoinType object as input parameter
Date Thu, 07 May 2020 17:53:00 GMT
Rex Xiong created SPARK-31660:
---------------------------------

             Summary: Dataset.joinWith supports JoinType object as input parameter
                 Key: SPARK-31660
                 URL: https://issues.apache.org/jira/browse/SPARK-31660
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.5
            Reporter: Rex Xiong


The current Dataset.joinWith API accepts joinType only as a String; it does not accept a JoinType object.
I would prefer a JoinType object (similar to an enum) over a String: there is less chance of a typo
and the code is more readable.
{code:scala}
def joinWith[U](other: Dataset[U], condition: Column, joinType: String): Dataset[(T, U)] =
{code}
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala


If I pass LeftOuter.sql as joinType, it throws an exception, because LeftOuter.sql contains
a white space:
{code:scala}
case object LeftOuter extends JoinType {
  override def sql: String = "LEFT OUTER"
}
{code}
However, JoinType.apply only removes underscores; it does not handle white space:
{code:scala}
object JoinType {
  def apply(typ: String): JoinType = typ.toLowerCase(Locale.ROOT).replace("_", "") match {
    case "inner" => Inner
    case "outer" | "full" | "fullouter" => FullOuter
    case "leftouter" | "left" => LeftOuter
    case "rightouter" | "right" => RightOuter
    case "leftsemi" | "semi" => LeftSemi
    case "leftanti" | "anti" => LeftAnti
    case "cross" => Cross
    case _ =>
      val supported = Seq(
        "inner",
        "outer", "full", "fullouter", "full_outer",
        "leftouter", "left", "left_outer",
        "rightouter", "right", "right_outer",
        "leftsemi", "left_semi", "semi",
        "leftanti", "left_anti", "anti",
        "cross")

      throw new IllegalArgumentException(s"Unsupported join type '$typ'. " +
        "Supported join types include: " + supported.mkString("'", "', '", "'") + ".")
  }
}
{code}
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala
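
To make the mismatch concrete, here is a minimal standalone sketch of the normalization step only. It does not depend on Spark; NormalizeDemo and normalize are hypothetical names used for illustration, and the body copies the expression from the JoinType.apply snippet quoted above.

```scala
import java.util.Locale

// Hypothetical standalone demo, not part of Spark.
object NormalizeDemo {
  // Same normalization as the quoted JoinType.apply:
  // lowercase the input, then drop underscores. Spaces are left untouched.
  def normalize(typ: String): String =
    typ.toLowerCase(Locale.ROOT).replace("_", "")

  def main(args: Array[String]): Unit = {
    println(normalize("left_outer")) // prints "leftouter", which matches a case
    println(normalize("LEFT OUTER")) // prints "left outer", which matches no case
  }
}
```

Since "left outer" is not one of the matched strings, the call falls through to the default case and the IllegalArgumentException is thrown.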

I suggest we either add another set of APIs that accept a JoinType instead of a String, or change
JoinType.apply to remove white space as well.
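
A rough sketch of what the two options could look like, using simplified stand-ins rather than Spark's real classes. Only two join types are modeled, and Joins and describeJoin are hypothetical names that stand in for an API such as joinWith; this is an assumption about the shape of a fix, not a proposed patch.

```scala
import java.util.Locale

// Simplified stand-ins for Spark's JoinType hierarchy (only two members shown).
sealed trait JoinType { def sql: String }
case object Inner extends JoinType { override def sql: String = "INNER" }
case object LeftOuter extends JoinType { override def sql: String = "LEFT OUTER" }

object JoinType {
  // Second option: strip white space as well as underscores before matching.
  def apply(typ: String): JoinType =
    typ.toLowerCase(Locale.ROOT).replaceAll("[_\\s]", "") match {
      case "inner" => Inner
      case "leftouter" | "left" => LeftOuter
      case _ =>
        throw new IllegalArgumentException(s"Unsupported join type '$typ'.")
    }
}

// First option: a typed overload next to the String one (hypothetical API).
object Joins {
  // The String variant parses the name and delegates to the typed variant.
  def describeJoin(joinType: String): String = describeJoin(JoinType(joinType))

  // The typed variant needs no parsing, so a typo becomes a compile error.
  def describeJoin(joinType: JoinType): String = s"${joinType.sql} JOIN"
}

object Demo {
  def main(args: Array[String]): Unit = {
    println(JoinType(LeftOuter.sql))          // prints "LeftOuter"
    println(Joins.describeJoin("left_outer")) // prints "LEFT OUTER JOIN"
    println(Joins.describeJoin(LeftOuter))    // prints "LEFT OUTER JOIN"
  }
}
```

With the white space stripped, LeftOuter.sql round-trips through JoinType.apply, and the typed overload avoids the String parsing altogether.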


