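Spark should be much faster with process_local than with node_local. With node_local the task reads its data from somewhere on the same node, typically the local hard disk; with process_local the data is already in memory in the same executor process that runs the task.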
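
As a rough sketch of how to see the difference yourself, assuming you are in spark-shell (so `sc` already exists) and that the dataset fits in executor memory, you could cache the RDD and count it twice. The path is the one from your message; everything else is only an illustration, not a tuned setup.

```scala
// Assumes a spark-shell session where `sc` (the SparkContext) is predefined.
val a = sc.textFile("/user/exobrain/batselem/LUBM1000")

// Mark the RDD to be kept in executor memory once it has been computed.
a.cache()

// First action: partitions are read from the file system, so tasks can be
// node_local at best. The partitions are cached as a side effect.
a.count()

// Second action: partitions that fit in memory are served from the cache,
// so their tasks can be scheduled as process_local.
a.count()
```

The locality level of each task is shown in the stage details of the web UI. How long the scheduler waits for a more local slot before falling back to a less local one is controlled by the `spark.locality.wait` setting.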

Mayur Rustagi
Ph: +1 (760) 203 3257

On Tue, Apr 22, 2014 at 4:45 PM, Joe L <> wrote:
I got the following performance. Is it normal for Spark to behave like
this? Sometimes Spark switches from process_local mode into node_local
mode and becomes 10x faster. I am very confused.

scala> val a = sc.textFile("/user/exobrain/batselem/LUBM1000")
scala> a.count()

Long = 137805557
took 130.809661618 s

Sent from the Apache Spark User List mailing list archive at