spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Java vs. Scala for Spark
Date Tue, 08 Sep 2015 14:48:13 GMT
Sean:
w.r.t. performance, I meant Scala/Java vs Python.

Cheers

On Tue, Sep 8, 2015 at 7:28 AM, Sean Owen <sowen@cloudera.com> wrote:

> Why would Scala vs Java performance be different, Ted? Relatively
> speaking there is almost no runtime difference; it's the same APIs or
> calls via a thin wrapper. Scala/Java vs Python is a different story.
>
> Java libraries can be used in Scala. Vice-versa too, though calling
> Scala-generated classes can be clunky in Java. What's your concern
> about interoperability Jeffrey?
>
> I disagree that Java 7 vs Scala usability is sooo different, but it's
> certainly much more natural to use Spark in Scala. Java 8 closes a lot
> of the usability gap with Scala, but not all of it. Enough that it's
> not crazy for a Java shop to stick to Java 8 + Spark and not be at a
> big disadvantage.
>
> The downsides of Scala IMHO are that it provides too much: lots of
> nice features (closures! superb collections!), lots of rope to hang
> yourself with too (implicits sometimes!) and some WTF features (XML
> literals!). Learning the good useful bits of Scala isn't hard. You can
> always write Scala code as much like Java as you like, I find.
>
> Scala tooling is different from Java tooling; that's an
> underappreciated barrier. For example I think SBT is good for
> development, bad for general project lifecycle management compared to
> Maven, but in any event still less developed. SBT/scalac are huge
> resource hogs, since so much of Scala is really implemented in the
> compiler; prepare to update your laptop to develop in Scala on your
> IDE of choice, and start to think about running long-running compile
> servers like we did in the year 2000.
>
> Still net-net I would choose Scala, FWIW.
>
> On Tue, Sep 8, 2015 at 3:07 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > Performance-wise, Scala is by far the best choice when you use Spark.
> >
> > The cost of learning Scala is not negligible but not insurmountable
> > either.
> >
> > My personal opinion.
> >
> > On Tue, Sep 8, 2015 at 6:50 AM, Bryan Jeffrey <bryan.jeffrey@gmail.com>
> > wrote:
> >>
> >> All,
> >>
> >> We're looking at language choice in developing a simple streaming
> >> processing application in Spark. We've got a small set of example code
> >> built in Scala. Articles like the following:
> >>
> >> http://www.bigdatatidbits.cc/2015/02/navigating-from-scala-to-spark-for.html
> >> would seem to indicate that Scala is great for use in distributed
> >> programming (including Spark).  However, there is a large group of folks
> >> that seem to feel that interoperability with other Java libraries leaves
> >> much to be desired, and that the cost of learning (yet another) language
> >> is quite high.
> >>
> >> Has anyone looked at Scala for Spark dev in an enterprise environment?
> >> What was the outcome?
> >>
> >> Regards,
> >>
> >> Bryan Jeffrey
> >
> >
>
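
A rough illustration of the point quoted above that Java 8 closes much of the usability gap: the sketch below applies the same function to a list twice, once with a Java 7-style anonymous inner class and once with a Java 8 stream and method reference. It uses only the JDK, not Spark. The class name LambdaGap and both method names are made up for this example, and java.util.function.Function (itself a Java 8 type) is only a stand-in for whatever function interface an API would expect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not part of Spark or of the original thread.
public class LambdaGap {

    // Java 7 style: the function is spelled out as an anonymous inner class
    // and applied in an explicit loop.
    static List<Integer> lengthsJava7(List<String> words) {
        Function<String, Integer> length = new Function<String, Integer>() {
            @Override
            public Integer apply(String s) {
                return s.length();
            }
        };
        List<Integer> out = new ArrayList<>();
        for (String w : words) {
            out.add(length.apply(w));
        }
        return out;
    }

    // Java 8 style: the same logic as a method reference passed to map().
    static List<Integer> lengthsJava8(List<String> words) {
        return words.stream()
                    .map(String::length)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "scala", "java");
        System.out.println(lengthsJava7(words)); // prints [5, 5, 4]
        System.out.println(lengthsJava8(words)); // prints [5, 5, 4]
    }
}
```

Both methods return the same list; the difference is only in how much ceremony surrounds the one line of actual logic.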
