I'm not sure I understood exactly what you need, but you could use one partition per line. Alternatively, you could use a multiple-output format in Hadoop (for example MultipleTextOutputFormat).
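
If neither of those fits, a third option (not one of the two above) is to write each record to its own file from inside a per-record action. Below is a minimal plain-Java sketch of only the per-record step. The class name PerKeyWriter, the method writeEntry, the ".txt" suffix, and the "field=value" line layout are all made up for illustration, and the value is assumed to be a Map<String, String>. It also assumes the key is usable as a file name as-is.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark or Hadoop.
class PerKeyWriter {

    // Writes one map as "field=value" lines to <dir>/<key>.txt and returns the file path.
    // Assumption: the key contains no path separators or other characters
    // that are invalid in a file name.
    static Path writeEntry(Path dir, String key, Map<String, String> value) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : value.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        Files.createDirectories(dir);              // no-op if the directory already exists
        Path target = dir.resolve(key + ".txt");   // file is named after the key
        return Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Sample data, invented for this example.
        Path dir = Files.createTempDirectory("per-key-output");
        Map<String, String> value = new LinkedHashMap<>();
        value.put("colour", "red");
        value.put("size", "L");
        Path written = writeEntry(dir, "user42", value);
        System.out.println("wrote " + written);
    }
}
```

In a Spark job, something like this could be called from foreach on the pair RDD, passing the tuple's key and value. Keep in mind that the files would then land on whichever executor handles the record, so the target directory would need to be on shared storage, and the directory is better passed as a String, since a Path captured in the closure may not be serializable.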

On 20. Jan 2018, at 22:56, pooja bhojwani <poojabhojwani10@gmail.com> wrote:

Hi all,

So, I have a Java Pair RDD with, let’s say, n lines. Each line has a unique key and a hash map as the value (there are no duplicate keys). I want to save each line as a separate text file, and since saveAsTextFile is not serializable, I need to somehow split the RDD into n RDDs or so and save each of them with the key as the file name. I am using Java and have been stuck on this for a long time. Does anyone have a clue?


Thanks,
Pooja