spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Spark error "value join is not a member of org.apache.spark.rdd.RDD[((String, String), String, String)]"
Date Tue, 09 Jun 2015 06:56:16 GMT
Try this way:

scala> val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim))
scala> val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim))
scala> val input11 = input1.map(x => (x(0) + x(1), (x(2), x(3))))
scala> val input22 = input2.map(x => (x(0) + x(1), (x(2), x(3))))

scala> input11.join(input22).take(10)


PairRDDFunctions requires an RDD[(K, V)], but yours is an
RDD[((String, String), String, String)] — a 3-tuple, not a key/value pair.
You can also look at keyBy if you don't want to concatenate your keys.
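To make the shape concrete, here is a minimal, self-contained sketch of the same join. The sample rows and the local SparkContext are assumptions standing in for the real /test7 and /test8 files; the point is only that mapping each row into a pair — key first, the remaining fields bundled as the value — is what makes join available:

```scala
import org.apache.spark.{SparkConf, SparkContext}
// On Spark 1.2.x the implicit conversion to PairRDDFunctions needs:
// import org.apache.spark.SparkContext._
// (from 1.3 onward it is picked up automatically)

object JoinSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("join-sketch").setMaster("local[2]"))

    // Hypothetical rows standing in for the split, trimmed lines of the files
    val input1 = sc.parallelize(Seq(Array("a", "b", "1", "2")))
    val input2 = sc.parallelize(Seq(Array("a", "b", "3", "4")))

    // key = (col0, col1), value = (col2, col3):
    // an RDD[((String, String), (String, String))], i.e. RDD[(K, V)]
    val input11 = input1.map(x => ((x(0), x(1)), (x(2), x(3))))
    val input22 = input2.map(x => ((x(0), x(1)), (x(2), x(3))))

    // join matches on the (col0, col1) key and pairs up the values
    val joined = input11.join(input22).collect().toList
    println(joined)

    sc.stop()
  }
}
```

The same thing via keyBy, which leaves the whole row as the value instead of splitting it: `input1.keyBy(x => (x(0), x(1)))` gives an RDD[((String, String), Array[String])], which joins the same way.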

Thanks
Best Regards

On Tue, Jun 9, 2015 at 10:14 AM, amit tewari <amittewari.5@gmail.com> wrote:

> Hi Dear Spark Users
>
> I am very new to Spark/Scala.
>
> Am using Datastax (4.7/Spark 1.2.1) and struggling with following
> error/issue.
>
> Already tried options like import org.apache.spark.SparkContext._ or
> explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions.
> But error not resolved.
>
> Help much appreciated.
>
> Thanks
> AT
>
> scala>val input1 = sc.textFile("/test7").map(line =>
> line.split(",").map(_.trim));
> scala>val input2 = sc.textFile("/test8").map(line =>
> line.split(",").map(_.trim));
> scala>val input11 = input1.map(x=>((x(0),x(1)),x(2),x(3)))
> scala>val input22 = input2.map(x=>((x(0),x(1)),x(2),x(3)))
>
>  scala> input11.join(input22).take(10)
>
> <console>:66: error: value join is not a member of
> org.apache.spark.rdd.RDD[((String, String), String, String)]
>
>               input11.join(input22).take(10)
>
