spark-user mailing list archives

From Ankur Dave
Subject Re: Effecient way to fetch all records on a particular node/partition in GraphX
Date Sun, 17 May 2015 18:45:29 GMT
If you know the partition IDs, you can launch a job that runs tasks only on
those partitions by calling sc.runJob.
For example, we do this in IndexedRDD
to look up particular keys without launching a task on every partition.
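
A minimal sketch of that approach, assuming a recent Spark release where sc.runJob takes (rdd, func, partitions); older 1.x releases took an additional allowLocal argument. The object name PartitionFetch, the helper fetchPartitions, the sample data, and the choice of partition index 5 are hypothetical and only for illustration. Note that it selects by partition index, not by physical node.

```scala
import scala.reflect.ClassTag
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PartitionFetch {
  // Hypothetical helper: collects the records of the given partitions only.
  // Tasks are scheduled for the partition indices in `ids`, not for every partition.
  // Returns one array per requested partition, in the order of `ids`.
  def fetchPartitions[T: ClassTag](sc: SparkContext, rdd: RDD[T], ids: Seq[Int]): Array[Array[T]] =
    sc.runJob(rdd, (it: Iterator[T]) => it.toArray, ids)

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so that the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("fetch-partition").setMaster("local[4]")
    val sc = new SparkContext(conf)
    try {
      // Illustrative data: 100 integers spread over 10 partitions.
      val rdd = sc.parallelize(1 to 100, numSlices = 10)

      // Fetch only partition 5; the other nine partitions are not touched.
      val records = fetchPartitions(sc, rdd, Seq(5)).head
      println(records.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Compared with filtering inside mapPartitionsWithIndex, this avoids launching a task on each of the partitions you are not interested in.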

Ankur

On Sun, May 17, 2015 at 8:32 AM, mas wrote:

> I have distributed my RDD across, say, 10 nodes. I want to fetch the data that
> resides on a particular node, say "node 5". How can I achieve this?
> I have tried using the mapPartitionsWithIndex function to filter the data of
> that node; however, it is pretty expensive.
