Thank you for the quick responses. It's useful to have some insight from folks already extensively using Spark.

Regards,
Bryan Jeffrey

On Tue, Sep 8, 2015 at 10:28 AM, Sean Owen <firstname.lastname@example.org> wrote:
Why would Scala vs Java performance be different Ted? Relatively
speaking there is almost no runtime difference; it's the same APIs or
calls via a thin wrapper. Scala/Java vs Python is a different story.
Java libraries can be used in Scala. Vice-versa too, though calling
Scala-generated classes can be clunky in Java. What's your concern
about interoperability Jeffrey?

I disagree that Java 7 vs Scala usability is sooo different, but it's
certainly much more natural to use Spark in Scala. Java 8 closes a lot
of the usability gap with Scala, but not all of it. Enough that it's
not crazy for a Java shop to stick to Java 8 + Spark and not be at a
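
To make the Java 7 vs Java 8 usability point above concrete, here is a rough sketch using only the plain JDK, not the Spark API. The class and method names (`UsabilityGap`, `lengthsJava7Style`, `lengthsJava8Style`) are made up for illustration. The first method passes behaviour the way Java 7 code typically had to, through an anonymous inner class; the second uses a Java 8 method reference, which reads much closer to the equivalent Scala.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not part of Spark or of the original thread.
class UsabilityGap {

    // Java 7 style: the function is written as an anonymous inner class.
    // (java.util.function.Function is itself a Java 8 interface; it is used
    // here only so that both versions share one function type.)
    static List<Integer> lengthsJava7Style(List<String> words) {
        Function<String, Integer> length = new Function<String, Integer>() {
            @Override
            public Integer apply(String word) {
                return word.length();
            }
        };
        List<Integer> result = new ArrayList<>();
        for (String word : words) {
            result.add(length.apply(word));
        }
        return result;
    }

    // Java 8 style: the same mapping expressed with a stream and a method reference.
    static List<Integer> lengthsJava8Style(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "scala", "java");
        System.out.println(lengthsJava7Style(words));
        System.out.println(lengthsJava8Style(words));
    }
}
```

Both methods return the same list; the difference is only in how much ceremony is needed to pass a function around, which is the kind of gap that shows up when handing functions to a framework.
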
The downsides of Scala IMHO are that it provides too much: lots of
nice features (closures! superb collections!), lots of rope to hang
yourself with too (implicits sometimes!) and some WTF features (XML
literals!) Learning the good useful bits of Scala isn't hard. You can
always write Scala code as much like Java as you like, I find.

Scala tooling is different from Java tooling; that's an
underappreciated barrier. For example I think SBT is good for
development, bad for general project lifecycle management compared to
Maven, but in any event still less developed. SBT/scalac are huge
resource hogs, since so much of Scala is really implemented in the
compiler; prepare to update your laptop to develop in Scala on your
IDE of choice, and start to think about running long-running compile
servers like we did in the year 2000.

Still net-net I would choose Scala, FWIW.
On Tue, Sep 8, 2015 at 3:07 PM, Ted Yu <email@example.com> wrote:
> Performance wise, Scala is by far the best choice when you use Spark.
> The cost of learning Scala is not negligible but not insurmountable either.
> My personal opinion.
> On Tue, Sep 8, 2015 at 6:50 AM, Bryan Jeffrey <firstname.lastname@example.org> wrote:
>> We're looking at language choice in developing a simple streaming
>> processing application in Spark. We've got a small set of example code
>> built in Scala. Articles like the following:
>> would seem to indicate that Scala is great for use in distributed
>> programming (including Spark). However, there is a large group of folks
>> that seem to feel that interoperability with other Java libraries is much to
>> be desired, and that the cost of learning (yet another) language is quite
>> Has anyone looked at Scala for Spark dev in an enterprise environment?
>> What was the outcome?
>> Bryan Jeffrey