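I think you have to use an alias. Both sides come from the same case class, so $"edid" === $"edid" ends up comparing a column with itself, which is what the warning is telling you. To give each Dataset an alias: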

val d1 = a.as("d1")
val d2 = b.as("d2")

Then join, using the alias in the column names:
d1.joinWith(d2, $"d1.edid" === $"d2.edid")

Finally, please double-check your column names. I did not see "edid" in your case class; the closest field I can find is "recentEdid".
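
For reference, here is a minimal end-to-end sketch of the alias approach. It is only an illustration under a few assumptions: it uses SparkSession, so it needs Spark 2.x or later (on 1.6 you would import sqlContext.implicits._ instead); the Stat class is trimmed to three fields; it joins on recentEdid, which is my guess at the column you meant; and AliasJoinSketch and joinOnRecentEdid are made-up names.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Trimmed stand-in for the Stat case class from the question (three fields only).
case class Stat(statType: Int, recentEdid: Int, avg: Double)

object AliasJoinSketch {
  // Joins two Datasets of the same type on recentEdid.
  // The aliases let each side of the predicate name a different input.
  def joinOnRecentEdid(spark: SparkSession,
                       a: Dataset[Stat],
                       b: Dataset[Stat]): Dataset[(Stat, Stat)] = {
    import spark.implicits._ // brings the $"..." column syntax into scope
    val d1 = a.as("d1")
    val d2 = b.as("d2")
    d1.joinWith(d2, $"d1.recentEdid" === $"d2.recentEdid")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("alias-join-sketch")
      .getOrCreate()
    import spark.implicits._ // needed for toDS()

    // Made-up sample rows, just to have something to join.
    val a = Seq(Stat(1, 100, 0.5), Stat(1, 200, 1.5)).toDS()
    val b = Seq(Stat(2, 100, 2.5)).toDS()

    joinOnRecentEdid(spark, a, b).show()
    spark.stop()
  }
}
```

joinWith keeps both sides as typed objects, so each result row is a (Stat, Stat) pair rather than a flattened Row.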


On Thu, Mar 10, 2016 at 9:09 PM, 박주형 <dkdkajej@gmail.com> wrote:
Hi. I want to join two Datasets, but the following is printed to stderr:

16/03/11 13:55:51 WARN ColumnName: Constructing trivially true equals predicate, ''edid = 'edid'. Perhaps you need to use aliases.
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'edid' given input columns dataType, avg, sigma, countUnique, numRows, recentEdid, categoryId, accCount, statType, categoryId, max, accCount, firstQuarter, recentEdid, replicationRateAvg, numRows, min, countNotNull, countNotNull, dcid, numDistinctRows, max, firstQuarter, min, replicationRateAvg, dcid, statType, avg, sigma, dataType, median, thirdQuarter, numDistinctRows, median, countUnique, thirdQuarter;

My case class is:
case class Stat(statType: Int, dataType: Int, dcid: Int, 
    categoryId: Int, recentEdid: Int, countNotNull: Int, countUnique: Int, accCount: Int, replicationRateAvg: Double,
    numDistinctRows: Double, numRows: Double, 
    min: Double, max: Double, sigma: Double, avg: Double, 
    firstQuarter: Double, thirdQuarter: Double, median: Double)

And my code is:
a.joinWith(b, $"edid" === $"edid").show()

If I use DataFrames, renaming a’s column could solve it. How can I join two Datasets of the same case class?