spark-user mailing list archives

From Mohammed Guller <moham...@glassbeam.com>
Subject RE: How to use groupByKey and CqlPagingInputFormat
Date Fri, 04 Jul 2014 19:20:32 GMT
As far as I know, there is not much difference, except that the outer parentheses are redundant.
The problem with your original code was a mismatch between the opening and closing
parentheses. Sometimes the error messages are misleading :-)
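
A minimal sketch of the point about map({}) versus map{}, using a plain Scala List in place of an RDD. The object name and sample data are made up for illustration; the same syntax rule should apply to RDD.map.

```scala
object MapSyntaxDemo {
  // Hypothetical sample data standing in for the (key, value) pairs of an RDD.
  val pairs: List[(String, Int)] = List(("a", 1), ("b", 2), ("a", 3))

  // Braces only: the block is a pattern-matching function literal.
  val withBraces: List[(String, Int)] =
    pairs.map { case (k, v) => (k, v * 10) }

  // Parentheses around the braces: the same function literal,
  // passed as an ordinary parenthesized argument.
  val withParensAndBraces: List[(String, Int)] =
    pairs.map({ case (k, v) => (k, v * 10) })

  def main(args: Array[String]): Unit = {
    // Both forms produce the same result.
    println(withBraces == withParensAndBraces)
  }
}
```

Both calls pass the same function to map, so the choice between them is a matter of style.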

Do you see any performance difference with the Datastax spark driver?

Mohammed

-----Original Message-----
From: Martin Gammelsæter [mailto:martingammelsaeter@gmail.com] 
Sent: Friday, July 4, 2014 12:43 AM
To: user@spark.apache.org
Subject: Re: How to use groupByKey and CqlPagingInputFormat

On Thu, Jul 3, 2014 at 10:29 PM, Mohammed Guller <mohammed@glassbeam.com> wrote:
> Martin,
>
> 1) The first map contains the columns in the primary key, which could be a compound primary
> key containing multiple columns, and the second map contains all the non-key columns.

Ah, thank you, that makes sense.

> 2) try this fixed code:
>     val navnrevmap = casRdd.map{
>       case (key, value) =>
>         (ByteBufferUtil.string(value.get("navn")),
>          ByteBufferUtil.toInt(value.get("revisjon")))
>     }.groupByKey()
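
For readers without a cluster at hand, here is a rough sketch of what the map followed by groupByKey computes, written with plain Scala collections instead of Spark and Cassandra. The sample rows, the object name, and the helper functions are assumptions for illustration; the decoding uses java.nio directly rather than ByteBufferUtil so the example has no external dependencies.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object GroupByKeySketch {
  // Hypothetical helpers that build the ByteBuffers a row's column map might hold.
  def strBuf(s: String): ByteBuffer =
    ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8))

  def intBuf(i: Int): ByteBuffer = {
    val buf = ByteBuffer.allocate(4)
    buf.putInt(i)
    buf.flip() // make the written bytes readable from position 0
    buf
  }

  // Decode without disturbing the buffer's position.
  def decodeString(buf: ByteBuffer): String =
    StandardCharsets.UTF_8.decode(buf.duplicate()).toString

  def decodeInt(buf: ByteBuffer): Int =
    buf.getInt(buf.position())

  // Made-up rows: each map plays the role of the non-key column map ("value").
  val rows: List[Map[String, ByteBuffer]] = List(
    Map("navn" -> strBuf("a"), "revisjon" -> intBuf(1)),
    Map("navn" -> strBuf("b"), "revisjon" -> intBuf(2)),
    Map("navn" -> strBuf("a"), "revisjon" -> intBuf(3))
  )

  // Step 1: turn each row into a (navn, revisjon) pair, as the map in the thread does.
  val pairs: List[(String, Int)] = rows.map { value =>
    (decodeString(value("navn")), decodeInt(value("revisjon")))
  }

  // Step 2: collect all revisjon values per navn, which is what groupByKey does on a pair RDD.
  val grouped: Map[String, List[Int]] =
    pairs.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }

  def main(args: Array[String]): Unit = {
    grouped.foreach { case (navn, revs) => println(s"$navn -> $revs") }
  }
}
```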

I changed from CqlPagingInputFormat to the new Datastax cassandra-spark driver, which is a
bit easier to work with, but thanks! I'm curious though, what is the semantic difference between
map({}) and map{}?