spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: how to start reading the spark source code?
Date Mon, 20 Jul 2015 04:27:09 GMT
e5c4cd8a5e188592f8786a265 was from 2011.

Not sure why you started with such an early commit.

The Spark project has evolved quite fast.

I suggest you clone the Spark project from github.com/apache/spark/ and start
with core/src/main/scala/org/apache/spark/rdd/RDD.scala
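
As a rough orientation for reading RDD.scala, here is a toy model of the split
between transformations and actions. It is only a sketch: ToyRDD, SeqToyRDD,
MappedToyRDD and ToyDemo are hypothetical names made up for illustration, not
Spark classes, and there is no scheduling or distribution in it. The one idea it
tries to show is that a transformation such as map only records a function,
while an action such as collect is what triggers the computation. In the real
RDD.scala, the corresponding places to look would be the abstract compute method
and the actions that hand work to SparkContext.runJob.

```scala
// Toy model only: these names are hypothetical and are not Spark's classes.
// Each dataset knows how to compute itself from its parent, nothing more.
abstract class ToyRDD[T] {
  // Produce the elements; called only when an action runs.
  def compute(): Iterator[T]

  // Transformation: records the function and returns a new dataset.
  // No element is processed here.
  def map[U](f: T => U): ToyRDD[U] = new MappedToyRDD(this, f)

  // Action: the point where computation is actually triggered.
  def collect(): List[T] = compute().toList
}

// A dataset backed by an in-memory sequence (the start of a lineage).
class SeqToyRDD[T](data: Seq[T]) extends ToyRDD[T] {
  def compute(): Iterator[T] = data.iterator
}

// A dataset defined as "parent, with f applied to every element".
class MappedToyRDD[T, U](parent: ToyRDD[T], f: T => U) extends ToyRDD[U] {
  def compute(): Iterator[U] = parent.compute().map(f)
}

object ToyDemo {
  def main(args: Array[String]): Unit = {
    var calls = 0
    val doubled = new SeqToyRDD(Seq(1, 2, 3)).map { x => calls += 1; x * 2 }
    println(calls)             // 0: map only recorded the function
    println(doubled.collect()) // List(2, 4, 6)
    println(calls)             // 3: the function ran during collect
  }
}
```

In Spark itself the action does not run the work in place as this toy does, so
following what an action such as collect calls is probably the quickest way from
RDD.scala to the scheduling code.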

Cheers

On Sun, Jul 19, 2015 at 7:44 PM, Yang <teddyyyy123@gmail.com> wrote:

> I'm trying to understand how Spark works under the hood, so I tried to
> read the source code.
>
> As I normally do, I downloaded the git source code and reverted to the very
> first version (actually e5c4cd8a5e188592f8786a265c0cd073c69ac886, since the
> first version lacked even the definition of RDD.scala).
>
> But the code looks "too simple" and I can't find where the "magic"
> happens, i.e. where a transformation/computation is scheduled on a machine,
> bytes are stored, etc.
>
> It would be great if someone could show me a path through the different
> source files involved, so that I could read each of them in turn.
>
> Thanks!
> Yang
>
