spark-user mailing list archives

From shijiaxin <shijiaxin...@gmail.com>
Subject Re: configuration needed to run twitter(25GB) dataset
Date Fri, 01 Aug 2014 09:12:08 GMT
When I use fewer partitions (for example 6), it seems that all the tasks are assigned to the same machine, because that machine has more than 6 cores. But this runs out of memory.
How can I use a smaller number of partitions and still spread the work across all the machines?
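
For reference, a minimal sketch of one way to do this in Scala (not taken from the original thread): the app name, dataset path, and partition counts below are placeholders, and whether spark.executor.cores is honoured depends on your cluster manager and Spark version. The idea is to cap the task slots per executor and explicitly repartition so tasks are shuffled across executors instead of being scheduled for locality on a single large node.

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical configuration: names and numbers are placeholders.
val conf = new SparkConf()
  .setAppName("twitter-25gb")
  // Cap cores per executor so one large machine cannot take every task slot
  // (supported on YARN; standalone support depends on the Spark version).
  .set("spark.executor.cores", "2")

val sc = new SparkContext(conf)

// Read with a modest partition count, then repartition: the shuffle spreads
// the blocks over all executors instead of favouring data locality on the
// node that happens to have free cores.
val lines  = sc.textFile("hdfs:///data/twitter", 24)
val spread = lines.repartition(24)

println(spread.count())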



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/configuration-needed-to-run-twitter-25GB-dataset-tp11044p11150.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
