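Spark is normally much faster with PROCESS_LOCAL than with NODE_LOCAL. With NODE_LOCAL the task reads its data from somewhere else on the same node, typically the local hard disk, while with PROCESS_LOCAL the data is already in memory in the same executor process that runs the task.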
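
To make the difference concrete, here is a rough sketch of how you could compare the two cases from spark-shell. It assumes `sc` is the shell's SparkContext and that the dataset fits in executor memory. The variable name `lubm` is only illustrative, and the path is the one from your message.

```scala
// Run inside spark-shell, where `sc` is already defined.
// `lubm` is an illustrative name; the path is taken from the question.
val lubm = sc.textFile("/user/exobrain/batselem/LUBM1000")

// Mark the RDD for in-memory caching. This is lazy: nothing is read yet.
lubm.cache()

// First action: partitions are read from the underlying file system,
// so tasks are NODE_LOCAL at best. The cache is filled as a side effect.
lubm.count()

// Second action: partitions that fit in memory are served from the
// executors' cache, so their tasks can be scheduled as PROCESS_LOCAL.
lubm.count()
```

The locality level of each task is shown in the stage detail page of the Spark web UI, so you can check which level each run actually used. How long the scheduler waits for a more local slot before falling back to a less local one is controlled by `spark.locality.wait`.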

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi



On Tue, Apr 22, 2014 at 4:45 PM, Joe L <selmee_j@yahoo.com> wrote:
I got the following performance. Is it normal for Spark to behave like this?
Sometimes Spark switches into node_local mode from process_local and it becomes
10x faster. I am very confused.

scala> val a = sc.textFile("/user/exobrain/batselem/LUBM1000")
scala> a.count()

Long = 137805557
took 130.809661618 s




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/help-me-tp4598.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.