spark-user mailing list archives

From <>
Subject Partitioning strategy
Date Sun, 02 Apr 2017 10:32:13 GMT

I have an RDD holding 4 years of data, split into, say, 20 partitions. At runtime, the user
can choose to select only a few months or years of that data. The RDD is filtered according
to the user's time selection, and further transformations and actions are then performed on
the filtered RDD. And, as Spark says, a child RDD gets its partitions from the parent RDD, so
the filtered RDD still has 20 partitions even when it holds far less data.

Therefore, is there any way to choose a partitioning strategy after the filter operation?
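
To make the question concrete, here is a minimal sketch of one possible approach: shrink the
partition count in proportion to the selected time window. It assumes records are spread
roughly evenly over the 4 years, which may not hold for real data. The helper
target_partitions and the names rdd and in_selected_range are hypothetical; filter, coalesce
and getNumPartitions are existing RDD methods, shown only in comments so the snippet runs
without a Spark installation.

```python
def target_partitions(parent_partitions, total_months, selected_months):
    """Scale the parent's partition count by the selected share of the time range.

    Assumption: data volume is roughly uniform over time.
    """
    if not 0 < selected_months <= total_months:
        raise ValueError("selected_months must be between 1 and total_months")
    # Integer ceiling division, so at least one partition is always kept.
    return -(-parent_partitions * selected_months // total_months)


# Hypothetical use with PySpark (not executed here; `rdd` and
# `in_selected_range` are placeholders for the real RDD and predicate):
#   filtered = rdd.filter(in_selected_range)
#   n = target_partitions(rdd.getNumPartitions(), 48, 6)
#   filtered = filtered.coalesce(n)

# 20 partitions over 48 months, user selects 6 months.
print(target_partitions(20, 48, 6))  # prints 3
```

coalesce reduces the number of partitions without a full shuffle, while repartition shuffles
and can also rebalance skewed partitions, so which one fits depends on how evenly the
filtered data ends up distributed.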

Jasbir Singh

