spark-dev mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Why's ds.foreachPartition(println) not possible?
Date Tue, 05 Jul 2016 20:27:09 GMT
I don't think that's a Scala compiler bug.

A bare println is a valid expression: it calls println() and returns Unit.

Unit is not a single-argument function, so it does not match either of
the overloads of foreachPartition.

You may be used to a conversion (eta-expansion) taking place when println
is passed to a method expecting a function, but that's not a safe thing
to do silently when there are multiple overloads.

tl;dr:

Just use

ds.foreachPartition(x => println(x))

You don't need any type annotations.
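
To make the point concrete without a Spark build, here is a minimal sketch. MiniDataset, JavaStyleFn, withOnePartition and elementsSeen are hypothetical names invented for illustration; they only mimic the shape of the two foreachPartition overloads quoted in the error message and are not Spark's classes. It assumes Scala 2, as used in this thread.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for ForeachPartitionFunction; not the Spark interface.
trait JavaStyleFn[T] { def call(it: Iterator[T]): Unit }

// Hypothetical miniature of Dataset with the same pair of overloads.
class MiniDataset[T](partitions: Seq[Seq[T]]) {
  def foreachPartition(f: Iterator[T] => Unit): Unit =
    partitions.foreach(p => f(p.iterator))
  def foreachPartition(func: JavaStyleFn[T]): Unit =
    partitions.foreach(p => func.call(p.iterator))
}

object OverloadSketch {
  // Not overloaded, so a bare method name can be converted to a function here.
  def withOnePartition(f: Iterator[Long] => Unit): Unit = f(Iterator(1L, 2L))

  // Passes an un-annotated lambda to the overloaded method and records
  // every element it sees, in partition order.
  def elementsSeen(ds: MiniDataset[Long]): List[Long] = {
    val seen = ListBuffer.empty[Long]
    ds.foreachPartition(it => it.foreach(x => seen += x))
    seen.toList
  }

  def main(args: Array[String]): Unit = {
    val ds = new MiniDataset[Long](Seq(Seq(0L, 1L), Seq(2L, 3L)))
    withOnePartition(println)                // println is converted to a function
    // ds.foreachPartition(println)          // the form this thread reports as rejected
    ds.foreachPartition(x => println(x))     // prints each partition's iterator object
    ds.foreachPartition(_.foreach(println))  // prints each element
    println(elementsSeen(ds))                // List(0, 1, 2, 3)
  }
}
```

Note that x => println(x) receives the whole Iterator for a partition, so it prints the iterator object rather than its elements; _.foreach(println) is the form that prints each element.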


On Tue, Jul 5, 2016 at 2:53 PM, Jacek Laskowski <jacek@japila.pl> wrote:
> Hi Reynold,
>
> Is this already reported and tracked somewhere? I'm quite sure that
> people will be asking about the reasons Spark does this. Where are
> such issues usually reported?
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Tue, Jul 5, 2016 at 6:19 PM, Reynold Xin <rxin@databricks.com> wrote:
>> This seems like a Scala compiler bug.
>>
>>
>> On Tuesday, July 5, 2016, Jacek Laskowski <jacek@japila.pl> wrote:
>>>
>>> Well, there is one foreach for Java and another foreach for Scala. That
>>> much I understand. But by supporting two language-specific APIs
>>> -- Scala and Java -- the Dataset API lost support for such simple calls
>>> without type annotations, so you have to be explicit about the variant
>>> (since I'm using Scala, I want to use the Scala API, right?). It appears
>>> that any single-argument-function operators in Datasets are affected :(
>>>
>>> My question was whether there is any work under way to fix it (if
>>> possible -- I don't know if it is).
>>>
>>>
>>>
>>> On Tue, Jul 5, 2016 at 4:21 PM, Sean Owen <sowen@cloudera.com> wrote:
>>> > Right, I should have noticed that in your second mail. But foreach
>>> > already does what you want, right? It would be identical here.
>>> >
>>> > However, these two methods do conceptually different things on different
>>> > arguments. I don't think I'd expect them to accept the same functions.
>>> >
>>> > On Tue, Jul 5, 2016 at 3:18 PM, Jacek Laskowski <jacek@japila.pl> wrote:
>>> >> ds is Dataset and the problem is that println (or any other
>>> >> one-element function) would not work here (and perhaps other methods
>>> >> with two variants - Java's and Scala's).
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Jul 5, 2016 at 3:53 PM, Sean Owen <sowen@cloudera.com> wrote:
>>> >>> A DStream is a sequence of RDDs, not of elements. I don't think I'd
>>> >>> expect to express an operation on a DStream as if it were elements.
>>> >>>
>>> >>> On Tue, Jul 5, 2016 at 2:47 PM, Jacek Laskowski <jacek@japila.pl>
>>> >>> wrote:
>>> >>>> Sort of. Your example works, but could you do a mere
>>> >>>> ds.foreachPartition(println)? Why not? Why should I even see the
>>> >>>> Java version?
>>> >>>>
>>> >>>> scala> val ds = spark.range(10)
>>> >>>> ds: org.apache.spark.sql.Dataset[Long] = [id: bigint]
>>> >>>>
>>> >>>> scala> ds.foreachPartition(println)
>>> >>>> <console>:26: error: overloaded method value foreachPartition with alternatives:
>>> >>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Long])Unit <and>
>>> >>>>   (f: Iterator[Long] => Unit)Unit
>>> >>>>  cannot be applied to (Unit)
>>> >>>>        ds.foreachPartition(println)
>>> >>>>           ^
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Jul 5, 2016 at 3:32 PM, Sean Owen <sowen@cloudera.com> wrote:
>>> >>>>> Do you not mean ds.foreachPartition(_.foreach(println)) or similar?
>>> >>>>>
>>> >>>>> On Tue, Jul 5, 2016 at 2:22 PM, Jacek Laskowski <jacek@japila.pl>
>>> >>>>> wrote:
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> It's with the master built today. Why can't I call
>>> >>>>>> ds.foreachPartition(println)? Is using type annotation the only way
>>> >>>>>> to go forward? I'd be so sad if that's the case.
>>> >>>>>>
>>> >>>>>> scala> ds.foreachPartition(println)
>>> >>>>>> <console>:28: error: overloaded method value foreachPartition with alternatives:
>>> >>>>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Record])Unit <and>
>>> >>>>>>   (f: Iterator[Record] => Unit)Unit
>>> >>>>>>  cannot be applied to (Unit)
>>> >>>>>>        ds.foreachPartition(println)
>>> >>>>>>           ^
>>> >>>>>>
>>> >>>>>> scala> sc.version
>>> >>>>>> res9: String = 2.0.0-SNAPSHOT
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> ---------------------------------------------------------------------
>>> >>>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>> >>>>>>