spark-user mailing list archives

From Mohammed Guller
Subject RE: How to use groupByKey and CqlPagingInputFormat
Date Thu, 03 Jul 2014 20:29:14 GMT

1) The first map contains the primary key columns (there can be several if the table has a
compound primary key), and the second map contains all the non-key columns.
2) Try this fixed code:
    val navnrevmap ={
      case (key, value) =>
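
To illustrate both points, here is a minimal sketch that uses plain Scala collections in place of the RDD, so it runs without Spark or Cassandra. It assumes each row arrives as a pair of java.util.Map[String, ByteBuffer] (key columns, non-key columns) and that "navn" is a non-key column; if it is part of the primary key, read it from the first map instead. The column names "id" and "revisjon" and the helpers row, text and string are hypothetical, made up for the example.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import java.util.{HashMap => JHashMap, Map => JMap}

object GroupByNavnSketch {
  // Assumed row shape: (primary-key columns, non-key columns).
  type Row = (JMap[String, ByteBuffer], JMap[String, ByteBuffer])

  // Encode a string as a UTF-8 ByteBuffer.
  private def text(s: String): ByteBuffer = ByteBuffer.wrap(s.getBytes(UTF_8))

  // Decode a UTF-8 ByteBuffer; duplicate() leaves the original buffer's position untouched.
  private def string(b: ByteBuffer): String = UTF_8.decode(b.duplicate()).toString

  // Hypothetical helper that builds one row; "id" and "revisjon" are made-up column names.
  def row(id: String, navn: String, revisjon: Int): Row = {
    val key = new JHashMap[String, ByteBuffer]()
    key.put("id", text(id))
    val value = new JHashMap[String, ByteBuffer]()
    value.put("navn", text(navn))
    value.put("revisjon", ByteBuffer.allocate(4).putInt(0, revisjon))
    (key, value)
  }

  // Turn each row into a (navn, revisjon) pair, then collect the values per navn.
  def groupByNavn(rows: Seq[Row]): Map[String, Seq[Int]] =
    rows
      .map { case (_, value) => (string(value.get("navn")), value.get("revisjon").getInt(0)) }
      .groupBy(_._1)
      .map { case (navn, pairs) => navn -> pairs.map(_._2) }

  def main(args: Array[String]): Unit = {
    val rows = Seq(row("1", "a", 1), row("2", "b", 2), row("3", "a", 3))
    groupByNavn(rows).foreach { case (navn, revs) => println(s"$navn -> ${revs.mkString(",")}") }
  }
}
```

On a real RDD the same map step should give an RDD[(String, Int)]. In Spark 1.x, groupByKey is added to RDDs of pairs through an implicit conversion, so it only becomes available on that result once `import org.apache.spark.SparkContext._` is in scope.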


-----Original Message-----
From: Martin Gammelsæter
Sent: Wednesday, July 2, 2014 4:36 AM
Subject: How to use groupByKey and CqlPagingInputFormat


Total Scala and Spark noob here with a few questions.

I am trying to modify a few of the examples in the Spark repo to fit my needs, but I am running
into a few problems.

I am making an RDD from Cassandra, which I've finally gotten to work, and trying to do some
operations on it. Specifically I am trying to do a grouping by key for future calculations.
I want the key to be the column "navn" from a certain column family, but I don't think I understand
the returned types. Why are two Maps returned, instead of one? I would expect a list of rows,
where each element is a single map from column name to value. So my first question is: What
do these maps represent?

    val casRdd = sc.newAPIHadoopRDD(job.getConfiguration(),

    val navnrevmap ={
      case (key, value) =>

The second question (probably stemming from my not understanding the first question) is why
am I not allowed to do a groupByKey in the above code? I understand that the type does not
have that function, but I'm unclear on what I have to do to make it work.

Best regards,
Martin Gammelsæter