spark-dev mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: [SQL] Dataset.map gives error: missing parameter type for expanded function?
Date Mon, 04 Apr 2016 18:05:47 GMT
It is called groupByKey now.  As with joinWith, the schema produced by
relational joins and aggregations is different from what you would expect
when working with objects, so when we combined DataFrame and Dataset we
renamed these functions to make the distinction clearer.
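For example (a minimal sketch against a recent 2.0.0-SNAPSHOT; it assumes
the implicits from spark.implicits._ are in scope, as they are in the
shell):

  val ds = Seq(("hello", 1), ("world", 2)).toDS()  // Dataset[(String, Int)]

  // Typed grouping by a function of the object; returns a
  // KeyValueGroupedDataset[String, (String, Int)].
  val grouped = ds.groupByKey(_._1)

  // Typed aggregation over the groups: Dataset[(String, Long)].
  val counts = grouped.count()

  // groupBy still exists, but it is the relational form: it groups by
  // columns and returns a RelationalGroupedDataset.
  val byColumn = ds.groupBy($"_1")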

On Sun, Apr 3, 2016 at 12:23 PM, Jacek Laskowski <jacek@japila.pl> wrote:

> Hi,
>
> (since it's about 2.0.0-SNAPSHOT, this is more a question for dev than user)
>
> With today's master I'm getting the following:
>
> scala> ds
> res14: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]
>
> // WHY?!
> scala> ds.groupBy(_._1)
> <console>:26: error: missing parameter type for expanded function
> ((x$1) => x$1._1)
>        ds.groupBy(_._1)
>                   ^
>
> scala> ds.filter(_._1.size > 10)
> res23: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]
>
> It's even on Michael's slide in
> https://youtu.be/i7l3JQRx7Qw?t=7m38s from Spark Summit East?! Am I
> doing something wrong? Please advise.
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>
