spark-user mailing list archives

From Manoj Samel <manojsamelt...@gmail.com>
Subject Re: groupBy RDD does not have grouping column ?
Date Mon, 31 Mar 2014 16:29:06 GMT
Thanks, that works.

It wasn't clear whether the second parameter list takes only aggregate
specifications or any expression.


On Mon, Mar 31, 2014 at 9:03 AM, Michael Armbrust <michael@databricks.com> wrote:

> This is similar to how SQL works: items in the GROUP BY clause are not
> included in the output by default.  You will need to include 'a in the
> second parameter list (which is similar to the SELECT clause) as well if
> you want it included in the output.
>
>
> On Sun, Mar 30, 2014 at 9:52 PM, Manoj Samel <manojsameltech@gmail.com> wrote:
>
>> Hi,
>>
>> If I create a groupBy('a)(Sum('b) as 'foo, Sum('c) as 'bar), then the
>> resulting RDD should have 'a, 'foo and 'bar.
>>
>> The result RDD just shows 'foo and 'bar and is missing 'a.
>>
>> Thoughts?
>>
>> Thanks,
>>
>> Manoj
>>
>
>
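
For reference, the fix described in the reply can be sketched as the fragment
below. It is not a complete program: it assumes `data` is a SchemaRDD with
columns a, b and c (the name `data` is hypothetical), and that the same
Spark SQL DSL used in the original question (the symbol syntax and `Sum`) is
already in scope. The only change from the original call is that 'a is
repeated in the second parameter list.

```scala
// Assumption: `data` is a SchemaRDD with columns a, b, c, and the DSL from
// the original question is already imported. `data` is a hypothetical name.

// Original call: 'a is used for grouping only, so the output has 'foo and 'bar.
val withoutKey = data.groupBy('a)(Sum('b) as 'foo, Sum('c) as 'bar)

// Suggested call: 'a also appears in the second list, so the output has
// 'a, 'foo and 'bar.
val withKey = data.groupBy('a)('a, Sum('b) as 'foo, Sum('c) as 'bar)

// Rough SQL analogue of the second call (table name `t` is hypothetical):
//   SELECT a, SUM(b) AS foo, SUM(c) AS bar FROM t GROUP BY a
```

The first parameter list plays the role of GROUP BY and the second the role
of SELECT, which is why the grouping column has to be named twice.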
