You mean this does not work?

SELECT key, COUNT(value) FROM table GROUP BY key



On Sun, Jul 19, 2015 at 2:28 PM, N B <nb.nospam@gmail.com> wrote:
Hello,

How do I perform the equivalent of the following SQL query in Spark Streaming? I will be using this on a windowed DStream.

SELECT key, COUNT(DISTINCT value) FROM table GROUP BY key;

For example, given the following dataset in the table:

 key | value
-----+-------
 k1  | v1
 k1  | v1
 k1  | v2
 k1  | v3
 k1  | v3
 k2  | vv1
 k2  | vv1
 k2  | vv2
 k2  | vv2
 k2  | vv2
 k3  | vvv1
 k3  | vvv1

The result should be:

 key | count
-----+-------
 k1  |     3
 k2  |     2
 k3  |     1

Thanks
Nikunj
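
For reference, the semantics being asked for can be sketched in plain Python, without Spark: first drop duplicate (key, value) pairs, then count what remains per key. The function name count_distinct_per_key and the rows list are made up for this illustration; the data is the table from the question. On a windowed DStream of (key, value) pairs, the same two steps would presumably become a per-RDD distinct() (via transform) followed by a map to (key, 1) and reduceByKey, but that mapping is an assumption and not something confirmed in this thread.

```python
from collections import defaultdict


def count_distinct_per_key(pairs):
    """Plain-Python equivalent of:
    SELECT key, COUNT(DISTINCT value) FROM table GROUP BY key
    """
    # Step 1: drop duplicate (key, value) pairs.
    unique_pairs = set(pairs)

    # Step 2: count the remaining pairs per key.
    counts = defaultdict(int)
    for key, _value in unique_pairs:
        counts[key] += 1
    return dict(counts)


# The example dataset from the question (hypothetical variable name).
rows = [
    ("k1", "v1"), ("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k1", "v3"),
    ("k2", "vv1"), ("k2", "vv1"), ("k2", "vv2"), ("k2", "vv2"), ("k2", "vv2"),
    ("k3", "vvv1"), ("k3", "vvv1"),
]

result = count_distinct_per_key(rows)
print(sorted(result.items()))  # [('k1', 3), ('k2', 2), ('k3', 1)]
```

Note that a plain COUNT(value) would return 5, 5 and 2 for this data, since it counts duplicates as well; the deduplication step is what makes the difference.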