spark-user mailing list archives

From amit tewari <amittewar...@gmail.com>
Subject Re: Spark error "value join is not a member of org.apache.spark.rdd.RDD[((String, String), String, String)]"
Date Tue, 09 Jun 2015 08:24:01 GMT
Actually the question was: will keyBy() accept multiple fields (e.g.
x(0), x(1)) as the key?
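
A minimal sketch of what that looks like, assuming the split-array RDDs
input1 and input2 from the thread below (the names keyed1 and keyed2 are
illustrative): keyBy takes an arbitrary function of the record, so a tuple
key such as (x(0), x(1)) works fine.

scala> // key each record by the composite (x(0), x(1)); the whole array stays as the value
scala> val keyed1 = input1.keyBy(x => (x(0), x(1)))
scala> val keyed2 = input2.keyBy(x => (x(0), x(1)))
scala> keyed1.join(keyed2).take(10)

Each element then has the shape ((x(0), x(1)), Array[String]), which is the
RDD[(K, V)] form that join needs.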

On Tue, Jun 9, 2015 at 1:07 PM, amit tewari <amittewari.5@gmail.com> wrote:

> Thanks Akhil. As you suggested, I will have to go the keyBy() route, as I
> need the columns intact.
> But will keyBy() accept multiple fields (e.g. x(0), x(1))?
>
> Thanks
> Amit
>
> On Tue, Jun 9, 2015 at 12:26 PM, Akhil Das <akhil@sigmoidanalytics.com>
> wrote:
>
>> Try this way:
>>
>> scala> val input1 = sc.textFile("/test7").map(line =>
>> line.split(",").map(_.trim))
>> scala> val input2 = sc.textFile("/test8").map(line =>
>> line.split(",").map(_.trim))
>> scala> val input11 = input1.map(x => (x(0) + x(1), (x(2), x(3))))
>> scala> val input22 = input2.map(x => (x(0) + x(1), (x(2), x(3))))
>>
>> scala> input11.join(input22).take(10)
>>
>>
>> PairRDDFunctions requires an RDD[(K, V)], i.e. an RDD of two-element
>> tuples, but in your case the element type is ((String, String), String,
>> String), a three-element tuple, so join is not available. You can also
>> look into keyBy if you don't want to concatenate your keys.
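>>
>> For example, a minimal sketch that keeps both value columns intact by
>> nesting them in a tuple instead of concatenating the keys (pairs1 and
>> pairs2 are illustrative names):
>>
>> scala> val pairs1 = input1.map(x => ((x(0), x(1)), (x(2), x(3))))
>> scala> val pairs2 = input2.map(x => ((x(0), x(1)), (x(2), x(3))))
>> scala> pairs1.join(pairs2).take(10)
>>
>> That gives an RDD[((String, String), (String, String))], a genuine
>> RDD[(K, V)], so join resolves.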
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Jun 9, 2015 at 10:14 AM, amit tewari <amittewari.5@gmail.com>
>> wrote:
>>
>>> Hi Dear Spark Users
>>>
>>> I am very new to Spark/Scala.
>>>
>>> I am using Datastax (4.7 / Spark 1.2.1) and am struggling with the
>>> following error.
>>>
>>> I have already tried options such as import org.apache.spark.SparkContext._
>>> and the explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions,
>>> but the error is not resolved.
>>>
>>> Help much appreciated.
>>>
>>> Thanks
>>> AT
>>>
>>> scala> val input1 = sc.textFile("/test7").map(line =>
>>> line.split(",").map(_.trim))
>>> scala> val input2 = sc.textFile("/test8").map(line =>
>>> line.split(",").map(_.trim))
>>> scala> val input11 = input1.map(x => ((x(0), x(1)), x(2), x(3)))
>>> scala> val input22 = input2.map(x => ((x(0), x(1)), x(2), x(3)))
>>>
>>> scala> input11.join(input22).take(10)
>>>
>>> <console>:66: error: value join is not a member of
>>> org.apache.spark.rdd.RDD[((String, String), String, String)]
>>>
>>>               input11.join(input22).take(10)
>>>
>>
>
