spark-user mailing list archives

From Ankur Dave <ankurd...@gmail.com>
Subject Re: How to correctly estimate the number of partitions of a graph in GraphX
Date Sun, 02 Nov 2014 06:06:38 GMT
How large is your graph, and how much memory does your cluster have?

We don't have a good way to determine the *optimal* number of partitions
aside from trial and error, but to get the job to at least run to
completion, it might help to use the MEMORY_AND_DISK storage level and a
large number of partitions.
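As a concrete sketch of that advice, the graph can be loaded with an explicit edge-partition count and with both edge and vertex data at the MEMORY_AND_DISK storage level, so partitions spill to disk instead of OOMing. The path and the partition count below are placeholders to tune for your cluster:

```scala
import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}
import org.apache.spark.storage.StorageLevel

// Assumes an existing SparkContext `sc`; the path and 512 are illustrative.
val graph = GraphLoader
  .edgeListFile(
    sc,
    "hdfs:///path/to/edges.txt",
    numEdgePartitions = 512,                          // many small partitions
    edgeStorageLevel = StorageLevel.MEMORY_AND_DISK,  // spill edges to disk
    vertexStorageLevel = StorageLevel.MEMORY_AND_DISK // spill vertices to disk
  )
  .partitionBy(PartitionStrategy.EdgePartition2D)

val cc = graph.connectedComponents().vertices
```

Increasing numEdgePartitions shrinks each partition (avoiding OOM) at the cost of more tasks and shuffle overhead, which is the trial-and-error trade-off described above.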

Ankur <http://www.ankurdave.com/>

On Sat, Nov 1, 2014 at 10:57 PM, James <alcaid1801@gmail.com> wrote:

> Hello,
>
> I am trying to run Connected Component algorithm on a very big graph. In
> practice I found that a small number of partition size would lead to OOM,
> while a large number would cause various time out exceptions. Thus I wonder
> how to estimate the number of partition of a graph in GraphX?
>
> Alcaid
>
