spark-user mailing list archives

From Ted Yu <>
Subject Re: how to start reading the spark source code?
Date Mon, 20 Jul 2015 04:27:09 GMT
e5c4cd8a5e188592f8786a265 was from 2011.

Not sure why you started with such an early commit.

The Spark project has evolved quite fast since then.

I suggest you clone the current Spark project and start
with core/src/main/scala/org/apache/spark/rdd/RDD.scala
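
As a rough orientation before opening that file: an RDD is described by its
partitions plus a compute method for each partition. Transformations such as
map only return a new RDD that remembers its parent, and nothing runs until an
action such as collect is called. The sketch below is a toy model of that idea
in Python. It is not Spark code; the names ToyRDD, ListRDD and MappedRDD are
made up for illustration, and it runs in a single process with no scheduler.

```python
# Toy model only: these classes are hypothetical and are NOT Spark's API.
from typing import Callable, Iterator, List


class ToyRDD:
    """A dataset described by partitions plus a way to compute each one."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions

    def compute(self, split: int) -> Iterator:
        raise NotImplementedError

    def map(self, f: Callable) -> "ToyRDD":
        # A transformation only records lineage; nothing is computed yet.
        return MappedRDD(self, f)

    def collect(self) -> List:
        # An action walks every partition and forces the computation.
        out: List = []
        for split in range(self.num_partitions):
            out.extend(self.compute(split))
        return out


class ListRDD(ToyRDD):
    def __init__(self, data: List, num_partitions: int) -> None:
        super().__init__(num_partitions)
        self.data = data

    def compute(self, split: int) -> Iterator:
        # Partition i holds every num_partitions-th element starting at i.
        return iter(self.data[split::self.num_partitions])


class MappedRDD(ToyRDD):
    def __init__(self, parent: ToyRDD, f: Callable) -> None:
        super().__init__(parent.num_partitions)
        self.parent = parent
        self.f = f

    def compute(self, split: int) -> Iterator:
        # Pull from the parent partition and apply f lazily.
        return (self.f(x) for x in self.parent.compute(split))


calls: List[int] = []


def traced(x: int) -> int:
    calls.append(x)  # record when the function actually runs
    return x * 10


rdd = ListRDD([1, 2, 3, 4], num_partitions=2).map(traced)
calls_before_action = list(calls)  # still empty: map was lazy
result = rdd.collect()             # the action triggers compute
```

In real Spark the step that collect performs in this loop is handed to the
scheduler, which is where tasks get shipped to machines, so following an
action out of RDD.scala is one way to find that code.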


On Sun, Jul 19, 2015 at 7:44 PM, Yang <> wrote:

> I'm trying to understand how Spark works under the hood, so I tried to
> read the source code.
> As I normally do, I downloaded the git source code and reverted to the very
> first version (actually e5c4cd8a5e188592f8786a265c0cd073c69ac886, since the
> first version even lacked the definition of RDD.scala),
> but the code looks "too simple" and I can't find where the "magic"
> happens, i.e. where a transformation/computation is scheduled on a machine,
> bytes are stored, etc.
> It would be great if someone could show me a path through the different
> source files involved, so that I could read each of them in turn.
> Thanks!
> Yang
