spark-user mailing list archives

From Sean Owen
Subject Re: Spark or MR, Scala or Java?
Date Sat, 22 Nov 2014 16:26:45 GMT
MapReduce is simpler and narrower, which also means it is generally
lighter-weight, with less to know and configure, and it runs more
predictably. If you have a job that is truly just a few maps, with maybe
one reduce, MR will likely be more efficient. Until recently its shuffle
was also more developed, and it offers some semantics the Spark shuffle
does not.
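
To make the "few maps, maybe one reduce" job shape concrete, here is a minimal sketch in plain Java. It uses only java.util.stream, not the Hadoop or Spark APIs, and the class name MapReduceShape and the helper countWords are hypothetical names chosen for illustration. It shows only the shape of the computation and says nothing about the performance of either framework.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: plain Java streams, not Hadoop or Spark code.
class MapReduceShape {

    // Two map-like steps followed by a single reduce-like step.
    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                // map step 1: emit every word of every line
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // map step 2: normalize each word
                .map(String::toLowerCase)
                // drop the empty token produced by blank lines
                .filter(word -> !word.isEmpty())
                // reduce step: sum the counts per word, keys kept in sorted order
                .collect(Collectors.toMap(word -> word, word -> 1, Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Spark or MR", "Scala or Java");
        System.out.println(countWords(lines));
    }
}
```

A job that stays this simple is the case where MR is likely to do well; once a pipeline chains many such stages, the comparison changes.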

I suppose it also integrates with tools like Oozie, which Spark does not.

I suggest learning enough Scala to use Spark in Scala. The amount you need
to know is not large.

(Mahout's MR-based implementations do not run on Spark and will not. They
have been removed instead.)
On Nov 22, 2014 3:36 PM, "Guillermo Ortiz" wrote:

> Hello,
> I'm a newbie with Spark, but I've been working with Hadoop for a while.
> I have two questions.
> Is there any case where MR is better than Spark? I don't know in which
> cases I should use Spark instead of MR. When is MR faster than Spark?
> The other question is: I know Java, so is it worth learning Scala for
> programming Spark, or is it okay to use just Java? I have written a small
> piece of code in Java because I feel more confident with it, but it
> seems that I'm missing something.
