spark-user mailing list archives

From Christopher Piggott
Subject Spark MakeRDD preferred workers
Date Mon, 08 Jan 2018 20:51:55 GMT

def makeRDD[T](seq: Seq[(T, Seq[String])])(implicit arg0: ClassTag[T]): RDD[T]
    list of tuples of data and location preferences (hostnames of Spark
    nodes)

Is that list a set of acceptable choices, from which Spark will choose one?
Or is it an ordered list?  I'm trying to ascertain how well it will
distribute the work if there's a lot of overlap between partitions and nodes.

In my particular case, the input to makeRDD is a Seq of (filePath, hosts[]),
where hosts are the nodes on which the file's blocks are local.
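
A minimal sketch of the setup described above, assuming a local SparkContext
and made-up file paths and hostnames (the object name, helper name, paths and
hosts are all hypothetical). It only shows how to pass the (filePath, hosts)
pairs to makeRDD and read back the preferences Spark recorded for each
partition; it does not settle how the scheduler chooses among the hosts.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MakeRddPreferences {
  // Hypothetical helper: builds the RDD and returns, per partition (in index
  // order), the location preferences Spark recorded for it.
  def recordedPreferences(sc: SparkContext,
                          filesWithHosts: Seq[(String, Seq[String])]): Seq[Seq[String]] = {
    // The explicit type parameter selects the overload that takes location
    // preferences; it creates one partition per element of the Seq.
    val rdd = sc.makeRDD[String](filesWithHosts)
    rdd.partitions.toSeq.map(p => rdd.preferredLocations(p))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("makeRDD-preferences").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up paths and hostnames, for illustration only.
      val filesWithHosts = Seq(
        ("/data/file-0", Seq("host-a", "host-b")),
        ("/data/file-1", Seq("host-b", "host-c")),
        ("/data/file-2", Seq("host-a", "host-c"))
      )
      recordedPreferences(sc, filesWithHosts).zipWithIndex.foreach {
        case (hosts, i) => println(s"partition $i prefers: ${hosts.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

On a real cluster the hostnames would have to match the names the executors
register with for the preferences to have any effect.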

