spark-user mailing list archives

From Nicholas Chammas <nicholas.cham...@gmail.com>
Subject Re: saving rdd to multiple files named by the key
Date Tue, 27 Jan 2015 17:15:07 GMT
There is also SPARK-3533 <https://issues.apache.org/jira/browse/SPARK-3533>,
which proposes to add a convenience method for this.
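
Until something like that exists, one common workaround is to write through Hadoop's MultipleTextOutputFormat and override how file names are generated. The following is only a rough sketch under assumptions, not a tested recipe: the class name KeyNamedTextOutputFormat, the object name SaveByKey, the sample data, and the output directory "foo" are all made up for illustration, and it assumes String keys and values and the old "mapred" Hadoop API.

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
// Brings the pair-RDD implicits into scope on Spark versions before 1.3.
import org.apache.spark.SparkContext._

// Hypothetical helper: sends each record to a file named after its key.
class KeyNamedTextOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  // Replace the key with NullWritable so only the value is written to the file.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()

  // Use the key as the file name instead of the default part-NNNNN name.
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.toString
}

object SaveByKey {
  def main(args: Array[String]): Unit = {
    // Local master and sample data are assumptions for illustration only.
    val conf = new SparkConf().setAppName("SaveByKey").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val pairs = sc.parallelize(Seq(("a", "1"), ("b", "2"), ("c", "3")))

    pairs
      // Co-locate all records of a key in one partition so that a single
      // task writes each key's file.
      .partitionBy(new HashPartitioner(2))
      .saveAsHadoopFile("foo", classOf[String], classOf[String],
        classOf[KeyNamedTextOutputFormat])

    sc.stop()
  }
}
```

With this sketch the values should end up in files such as foo/a and foo/b rather than foo/part-0000..., though I would check the behaviour against your Spark and Hadoop versions before relying on it.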

On Mon Jan 26 2015 at 10:38:56 PM Aniket Bhatnagar <
aniket.bhatnagar@gmail.com> wrote:

> This might be helpful:
> http://stackoverflow.com/questions/23995040/write-to-multiple-outputs-by-key-spark-one-spark-job
>
> On Tue Jan 27 2015 at 07:45:18 Sharon Rapoport <sharon@plaid.com> wrote:
>
>> Hi,
>>
>> I have an rdd of [k,v] pairs. I want to save each [v] to a file named [k].
>> I got them by combining many [k,v] by [k]. I could then save to file by
>> partitions, but that still doesn't allow me to choose the name, and leaves
>> me stuck with foo/part-0000...
>>
>> Any tips?
>>
>> Thanks,
>> Sharon
>>
>
