spark-dev mailing list archives

From Thodoris Zois <>
Subject Isolate 1 partition and perform computations
Date Sat, 14 Apr 2018 22:12:41 GMT
Hello list,

I am sorry for sending this message here, but I could not get any response on the “users” list.
For a specific purpose, I would like to isolate one partition of an RDD and perform computations
only on that partition.

For instance, suppose that a user asks Spark to create 500 partitions for an RDD. I would
like Spark to create all the partitions but perform computations on only one of those
500, ignoring the other 499.
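
The scenario above can be sketched from the user side, without touching Spark internals. This is only an illustration under stated assumptions, not necessarily what is wanted here: it assumes the public SparkContext.runJob overload that takes a list of partition indices is acceptable, and the names SinglePartitionSketch, sumOfPartition and target are hypothetical, introduced just for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object SinglePartitionSketch {

  // Sums the elements of one partition of `rdd`.
  // runJob is asked to compute only the partition index in `target`,
  // so no result tasks are submitted for the remaining partitions.
  def sumOfPartition(sc: SparkContext, rdd: RDD[Int], target: Int): Long = {
    val results: Array[Long] = sc.runJob(
      rdd,
      (it: Iterator[Int]) => it.map(_.toLong).sum,
      Seq(target)
    )
    results.head
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption made so the sketch runs standalone.
    val conf = new SparkConf().setAppName("single-partition-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // The RDD still has 500 partitions; only one of them is computed.
      val rdd = sc.parallelize(1 to 5000, numSlices = 500)
      val total = sumOfPartition(sc, rdd, target = 0)
      println(s"partitions=${rdd.getNumPartitions}, sum of partition 0=$total")
    } finally {
      sc.stop()
    }
  }
}
```

Note that this only limits the tasks of the final stage. If the RDD depends on a shuffle, the upstream stages would still be computed in full, so this may not be enough for the use case described.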

At first I tried to modify the executor so that it runs only one partition (task), but I could
not make it work. Then I tried the DAG Scheduler, but I think I should modify the
code at a higher level: let Spark do the partitioning, but in the end see only one partition
and throw away all the others.

My question is: which file should I modify in order to isolate one partition of the
RDD? Where is the actual partitioning done?

I hope it is clear!

Thank you very much,
