spark-user mailing list archives

From shijiaxin <shijiaxin...@gmail.com>
Subject Re: configuration needed to run twitter(25GB) dataset
Date Fri, 01 Aug 2014 04:40:39 GMT
Is it possible to reduce the number of edge partitions and still fully
exploit parallelism?
For example, could there be one partition per node, with all the threads on
that node sharing that partition?



