spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Saving each line of RDD as a separate file with key as the file name
Date Sat, 20 Jan 2018 22:03:43 GMT
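Not sure I understood exactly what you need, but you could partition the RDD so that each line
ends up in its own partition, which gives you one output file per line. Alternatively, you could
use Hadoop's multiple-output support (MultipleOutputs, or MultipleTextOutputFormat in the old
mapred API), which lets the output file name be derived from the key.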
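
A third route, which is my own addition rather than one of the two options above, is to skip
splitting the RDD and write each record from inside foreach. Below is a minimal sketch of the
per-record writer in plain Java. The class name PerKeyWriter, the method writeRecord, the ".txt"
suffix and the Map<String, String> value type are all hypothetical, and it assumes the output
directory is on a filesystem every executor can see (local mode or a shared mount). For HDFS
the java.nio calls would have to be replaced with the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spark or Hadoop.
class PerKeyWriter {

    // Writes one record to <outDir>/<sanitized key>.txt and returns the path written.
    public static Path writeRecord(Path outDir, String key, Map<String, String> value)
            throws IOException {
        Files.createDirectories(outDir);
        // The key becomes a file name, so replace characters that are unsafe in paths.
        // Assumption: keys stay unique after this replacement; otherwise files collide.
        String safeName = key.replaceAll("[^A-Za-z0-9._-]", "_");
        Path target = outDir.resolve(safeName + ".txt");
        // Copy into a TreeMap so the entries are written in a deterministic order.
        String line = new TreeMap<>(value).toString() + System.lineSeparator();
        Files.write(target, line.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path outDir = Files.createTempDirectory("per-key-output");
        Map<String, String> value = new TreeMap<>();
        value.put("a", "1");
        value.put("b", "2");
        Path written = writeRecord(outDir, "key1", value);
        System.out.println(written.getFileName() + " written under " + outDir);
    }
}
```

With a JavaPairRDD<String, HashMap<String, String>> this could be called as
pairRdd.foreach(t -> PerKeyWriter.writeRecord(outDir, t._1(), t._2())), so each executor writes
the records it holds. Since the keys are unique, no two tasks should write the same file,
although a retried task will overwrite its own earlier output.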

> On 20. Jan 2018, at 22:56, pooja bhojwani <poojabhojwani10@gmail.com> wrote:
> 
> Hi all,
> 
> So, I have a Java Pair RDD with, let’s say, n lines. Each line has a unique key and a
> hash map as the value (there are no duplicate keys). I want to save each line as a separate
> text file, and since saveAsTextFile is not serializable, I need to somehow split the RDD into
> n RDDs or so and save each of them with the key as the file name. I am using Java and I have
> been stuck on this for a long time. Anyone got a clue?
> 
> 
> Thanks,
> Pooja
