spark-user mailing list archives

From kant kodali <>
Subject Re: Scala Vs Python
Date Thu, 01 Sep 2016 14:57:26 GMT
C'mon man, this is a no-brainer. Dynamically typed languages for large code bases or
large-scale distributed systems make absolutely no sense. I could write a 10-page
essay on why that wouldn't work so well. You might be wondering why Spark would
have it then? Well, probably because of its ease of use for ML (that would be my
best guess).

On Wed, Aug 31, 2016 11:45 PM, AssafMendelson wrote:
I believe this would greatly depend on your use case and your familiarity with
the languages.

In general, Scala will perform much better than Python, and not all
interfaces are available in Python.

That said, if you are planning to use DataFrames without any UDFs, then the
performance hit is practically nonexistent.

Even if you need UDFs, it is possible to write those in Scala and wrap them for
Python and still get away without the performance hit.
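
The UDF cost mentioned above can be pictured with a rough toy model. This is plain Python, not Spark code: the names `builtin_upper` and `python_udf_upper` are made up for this sketch, and a real engine batches rows rather than pickling one value at a time. It only illustrates the general idea that a Python UDF makes data cross a serialization boundary in both directions, while a built-in expression does not.

```python
import pickle

# Toy model only: this is NOT Spark code. It mimics the extra work a Python
# UDF adds, where each value leaves the engine, is deserialized in a Python
# worker, transformed, and then serialized back.


def builtin_upper(column):
    """Stand-in for a built-in expression: no serialization boundary."""
    return [value.upper() for value in column]


def python_udf_upper(column):
    """Stand-in for a Python UDF: every value is pickled on the way in and out."""
    bytes_moved = 0
    out = []
    for value in column:
        sent = pickle.dumps(value)           # engine -> Python worker
        result = pickle.loads(sent).upper()  # the user function runs in Python
        returned = pickle.dumps(result)      # Python worker -> engine
        bytes_moved += len(sent) + len(returned)
        out.append(pickle.loads(returned))
    return out, bytes_moved


names = ["ayan", "assaf", "kant"]
fast = builtin_upper(names)
slow, overhead = python_udf_upper(names)

# Same answer either way; the UDF path just pays for the round trips.
print(fast == slow, overhead > 0)
```

Both paths return the same values; the only difference is the extra bytes shuffled back and forth, which is the kind of overhead a Scala UDF wrapped for Python would sidestep.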

Python does not have interfaces for UDAFs.

I believe that if you have large structured data and do not generally need
UDFs/UDAFs, you can certainly work in Python without losing too much.

From:  ayan guha [mailto:[hidden email]]
Sent:  Thursday, September 01, 2016 5:03 AM
To:  user
Subject:  Scala Vs Python

Hi Users

Thought to ask (again and again) the question: While I am building any
production application, should I use Scala or Python?

I have read many if not most articles, but all seem to be pre-Spark 2. Has anything
changed with Spark 2? Either in a pro-Scala way or a pro-Python way?

I am thinking performance, feature parity and future direction, not so much in
terms of skillset or ease of use.

Or, if you think it is a moot point, please say so as well.

Any real life example, production experience, anecdotes, personal taste,
profanity all are welcome :)


Best Regards,
Ayan Guha

