spark-user mailing list archives

From Ryan <ryan.hd....@gmail.com>
Subject Re: Spark GroupBy Save to different files
Date Sat, 02 Sep 2017 02:28:43 GMT
You may try foreachPartition.
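
A minimal sketch of that idea, not a tested recipe: the attempt quoted below likely fails because toDF() and saveAsObjectFile() are driver-side operations, and they cannot be used inside a foreach closure that runs on the executors. The sketch instead partitions the records by city and writes plain files from inside foreachPartition. The object name SaveByCity, the helper writeByCity, the output directory /tmp/files, the partition count of 2, and the comma-separated line format are all assumptions made for illustration.

```scala
import java.io.{File, PrintWriter}
import org.apache.spark.HashPartitioner
import org.apache.spark.sql.SparkSession

case class Person(fName: String, city: String)

object SaveByCity {
  // Hypothetical helper: writes one text file per city found in this partition.
  def writeByCity(records: Iterator[(String, Person)], outDir: String): Unit = {
    new File(outDir).mkdirs()
    // Buffers the whole partition in memory; acceptable for a sketch,
    // not for very large partitions.
    records.toList.groupBy(_._1).foreach { case (city, rows) =>
      val out = new PrintWriter(new File(outDir, s"$city.txt"))
      try rows.foreach { case (_, p) => out.println(s"${p.fName},${p.city}") }
      finally out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs on a single machine.
    val spark = SparkSession.builder().appName("SaveByCity").master("local[*]").getOrCreate()
    val people = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))

    spark.sparkContext.parallelize(people)
      .keyBy(_.city)                       // RDD[(String, Person)] keyed by city
      .partitionBy(new HashPartitioner(2)) // all records of a city share a partition
      .foreachPartition(part => writeByCity(part, "/tmp/files"))

    spark.stop()
  }
}
```

Because the data is hash-partitioned by city first, each city's file is written by exactly one task. Note that the files land on the local disk of whichever executor runs the task, so on a real cluster a shared or distributed filesystem would be needed. If DataFrame output formats are acceptable, rdd.toDF().write.partitionBy("city") on the driver is another route to one directory per city.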

On Fri, Sep 1, 2017 at 10:54 PM, asethia <sethia.arun@gmail.com> wrote:

> Hi,
>
> I have a list of person records in the following format:
>
> case class Person(fName:String, city:String)
>
> val l=List(Person("A","City1"),Person("B","City2"),Person("C","City1"))
>
> val rdd:RDD[Person]=sc.parallelize(l)
>
> val groupBy:RDD[(String, Iterable[Person])]=rdd.groupBy(_.city)
>
> I would like to save these grouped records to different files (for example,
> one per city). Can someone please help me here?
>
> I tried this, but it does not create those files:
>
>  groupBy.foreach(x=>{
>     x._2.toList.toDF().rdd.saveAsObjectFile(s"file:///tmp/files/${x._1}")
>   })
>
> Thanks
> Arun