spark-issues mailing list archives

From "Tim Gautier (JIRA)"
Subject [jira] [Created] (SPARK-15620) creates a dataset that can't be self-joined
Date Fri, 27 May 2016 19:32:13 GMT
Tim Gautier created SPARK-15620:

             Summary: creates a dataset that can't be self-joined
                 Key: SPARK-15620
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.6.1
         Environment: EC2, Spark-shell
            Reporter: Tim Gautier

Given this case class and Dataset:
case class Test(id: Int)
val test = Seq(

'test' can be joined with itself successfully
{code}
test.as("t1").joinWith(test.as("t2"), $"t1.id" === $"t2.id").show
{code}

However, mapping 'test' like this
val testMapped = test.map(t => t.copy(id = t.id + 1))
results in a new Dataset that can't be joined to itself
{code}
scala> testMapped.as("t1").joinWith(testMapped.as("t2"), $"t1.id" === $"t2.id").show
org.apache.spark.sql.AnalysisException: cannot resolve 't1.id' given input columns: [id];
{code}

This also throws an error:
val testMapped2 ="t1").joinWith("t2"), $"t1.value" === $"t2.value").show

This message was sent by Atlassian JIRA
