spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: Java 8 vs Scala
Date Wed, 15 Jul 2015 09:48:46 GMT
Why would you create a class, instantiate it just to store data, and then
change the class every time you have to add a new element? In OOP
terminology a class represents an object, and an object has state, does
it not?

Purely from a data warehousing perspective, one of the fundamental
principles in delivering a DW system is to ensure a Single Version of Truth,
and that is what a functional way of thinking naturally supports.

We can say by extension that data analytics algorithms are quite in tune
with the functional way of thinking, and therefore with Scala, whereas the
object-oriented way of thinking needs to adapt itself to be functional. Of
course we can use OOP concepts for delivering data solutions, just as we can
implement OOP concepts in C.

Java is good for solving problems that require OOP rigor, and Scala
mostly for problems that suit a functional way of problem solving, purely
from a data processing perspective.
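
To make the functional style concrete, here is a minimal sketch of a
map/reduce over a local collection using Java 8 lambdas. The class and
method names are made up for illustration, and it deliberately uses only
the JDK stream API, not Spark, so it runs on its own.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class, for illustration only.
class SumOfSquares {

    // Map each element to its square, then reduce by summing.
    // The input list is never modified and no mutable state is shared.
    static int sumOfSquares(List<Integer> values) {
        return values.stream()
                     .map(x -> x * x)
                     .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumOfSquares(values)); // prints 30
    }
}
```

The same map-then-reduce shape is what the RDD transformations discussed
in this thread express; only the collection type differs.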

Those who are using performance timings to compare these two languages
should start coding in machine-level language (MLL), measure the performance
gains of MLL over Java, and then switch over to MLL. Of course MLL
is a bit more verbose than Java, just as Java is a bit more verbose than
Scala and Python, but who's complaining?

Of course, these are my personal thoughts; I may be completely wrong and
would be grateful if someone could illustrate how.


Regards,
Gourav


On Wed, Jul 15, 2015 at 10:03 AM, Reinis Vicups <spark@orbit-x.de> wrote:

>  We have a complex application that has been running in production for a
> couple of months and makes heavy use of Spark in Scala.
>
> Just to give you some insight into the complexity: our source data is not
> that large (only about 500'000 complex elements), but our machine learning
> algorithms perform more than a billion transformations and produce as many
> intermediate data elements.
> Our current spark/mesos cluster consists of 120 CPUs, 190 GB RAM and
> plenty of HDD space.
>
> Now regarding your question:
>
> - Scala is simply a beautiful language in its own right; that has nothing
> to do with Spark;
>
> - the Spark API fits very naturally into Scala semantics because the
> map/reduce transformations are written more or less identically for local
> collections and RDDs;
>
> - as with any religious topic, there is controversy about which
> language is better, and most of the time (I have read quite a lot of
> blog/forum topics on this) the argumentation is based on which religion one
> belongs to (e.g. Java vs Scala vs Python);
>
> - we have checked the supposed performance issues and limitations of Scala
> described here (http://www.infoq.com/news/2011/11/yammer-scala) by
> refactoring to the "best practices" described in the article, and have
> observed both a performance increase in some places and, at the same time,
> a performance decrease in other places. Thus I would say there is no
> noticeable performance difference between Scala and Java in our use case
> (of course there are and always will be applications where one or the other
> language performs better);
>
> hope I could help
> reinis
>
>
>
> On 15.07.2015 09:27, 诺铁 wrote:
>
> I think different teams have different answers to this question. My team
> uses Scala and is happy with it.
>
> On Wed, Jul 15, 2015 at 1:31 PM, Tristan Blakers <tristan@blackfrog.org>
> wrote:
>
>> We have had excellent results operating on RDDs using Java 8 with
>> Lambdas. It’s slightly more verbose than Scala, but I haven’t found this an
>> issue, and haven’t missed any functionality.
>>
>>  The new DataFrame API makes the Spark platform even more language
>> agnostic.
>>
>>  Tristan
>>
>> On 15 July 2015 at 06:40, Vineel Yalamarthy <vineelyalamarthy@gmail.com>
>> wrote:
>>
>>> Good question. Like you, many are in the same boat (coming from a
>>> Java background). Looking forward to a response from the community.
>>>
>>>  Regards
>>>  Vineel
>>>
>>> On Tue, Jul 14, 2015 at 2:30 PM, spark user <
>>> spark_user@yahoo.com.invalid> wrote:
>>>
>>>>  Hi All
>>>>
>>>>  To start a new project in Spark, which technology is better: Java 8 or
>>>>  Scala?
>>>>
>>>>  I am a Java developer. Can I start with Java 8, or do I need to learn
>>>> Scala?
>>>>
>>>>  Which one is the better technology for quickly starting a POC project?
>>>>
>>>>  Thanks
>>>>
>>>>  - su
>>>>
>>>
>>>
>>>
>>> --
>>>
>>>  Thanks and Regards,
>>> Venkata Vineel, Student  ,School of Computing
>>>  Mobile : +1-385-2109-788
>>>
>>>  -*Innovation is the ability to convert **ideas into invoices*
>>>
>>>
>>
>
>
