spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Uğur Sopaoğlu <usopao...@gmail.com>
Subject Re: Write to HDFS
Date Fri, 20 Oct 2017 14:01:24 GMT
Actually, when I run following code,

  val textFile = sc.textFile("Sample.txt")
  val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)


It save the results into more than one partition like part-00000,
part-00001. I want to collect all of them into one file.


2017-10-20 16:43 GMT+03:00 Marco Mistroni <mmistroni@gmail.com>:

> Hi
>  Could you just create an rdd/df out of what you want to save and store it
> in hdfs?
> Hth
>
> On Oct 20, 2017 9:44 AM, "Uğur Sopaoğlu" <usopaoglu@gmail.com> wrote:
>
>> Hi all,
>>
>> In word count example,
>>
>>   val textFile = sc.textFile("Sample.txt")
>>   val counts = textFile.flatMap(line => line.split(" "))
>>                  .map(word => (word, 1))
>>                  .reduceByKey(_ + _)
>>  counts.saveAsTextFile("hdfs://master:8020/user/abc")
>>
>> I want to write collection of "*counts" *which is used in code above to
>> HDFS, so
>>
>> val x = counts.collect()
>>
>> Actually I want to write *x *to HDFS. But spark wants to RDD to write
>> sometihng to HDFS
>>
>> How can I write Array[(String,Int)] to HDFS
>>
>>
>> --
>> Uğur
>>
>


-- 
Uğur Sopaoğlu

Mime
View raw message