spark-user mailing list archives

From 박주형 <dkdka...@gmail.com>
Subject How can I join two DataSet of same case class?
Date Fri, 11 Mar 2016 05:09:24 GMT
Hi. I want to join two Datasets, but the following error is printed to stderr:

16/03/11 13:55:51 WARN ColumnName: Constructing trivially true equals predicate, ''edid =
'edid'. Perhaps you need to use aliases.
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'edid' given
input columns dataType, avg, sigma, countUnique, numRows, recentEdid, categoryId, accCount,
statType, categoryId, max, accCount, firstQuarter, recentEdid, replicationRateAvg, numRows,
min, countNotNull, countNotNull, dcid, numDistinctRows, max, firstQuarter, min, replicationRateAvg,
dcid, statType, avg, sigma, dataType, median, thirdQuarter, numDistinctRows, median, countUnique,
thirdQuarter;


My case class is:
case class Stat(statType: Int, dataType: Int, dcid: Int,
    categoryId: Int, recentEdid: Int, countNotNull: Int, countUnique: Int, accCount: Int,
    replicationRateAvg: Double,
    numDistinctRows: Double, numRows: Double,
    min: Double, max: Double, sigma: Double, avg: Double,
    firstQuarter: Double, thirdQuarter: Double, median: Double)

and my code is:
a.joinWith(b, $"edid" === $"edid").show()

If I used DataFrames, renaming a’s columns would solve it. How can I join two Datasets of the
same case class?
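
For reference, here is a minimal sketch of the kind of self-join being asked about, following the "use aliases" hint in the warning. It makes several assumptions that are not in the message: the join key is taken to be recentEdid (the case class has no field named edid, which is consistent with the "cannot resolve 'edid'" error), Stat is trimmed to two fields for brevity, the helper name joinOnRecentEdid is hypothetical, and the SparkSession entry point assumes Spark 2.x or later (on 1.6 the implicits would come from a SQLContext instead).

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Trimmed stand-in for the Stat case class in the message
// (assumption: only two of the original fields are kept).
case class Stat(recentEdid: Int, avg: Double)

object SelfJoinSketch {
  // Hypothetical helper: aliases both sides so the join condition can
  // refer to each side's column unambiguously.
  def joinOnRecentEdid(spark: SparkSession,
                       a: Dataset[Stat],
                       b: Dataset[Stat]): Dataset[(Stat, Stat)] = {
    import spark.implicits._
    a.as("a").joinWith(b.as("b"), $"a.recentEdid" === $"b.recentEdid")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("self-join-sketch")
      .getOrCreate()
    import spark.implicits._

    val a = Seq(Stat(1, 0.5), Stat(2, 1.5)).toDS()
    val b = Seq(Stat(2, 2.5), Stat(3, 3.5)).toDS()

    // Each result row is a pair (Stat from a, Stat from b).
    joinOnRecentEdid(spark, a, b).show()

    spark.stop()
  }
}
```

Without the aliases, both sides of the condition refer to the same unqualified column name, which is what the "trivially true equals predicate" warning points at.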