spark-user mailing list archives

From: colzer <471519...@qq.com>
Subject: Re: Writing all values for same key to one file
Date: Fri, 05 Aug 2016 07:00:27 GMT
In my opinion, "append to a file" may not be a good idea. By using
`MultipleTextOutputFormat`, you can instead write all values for a given key
into a directory named after that key.

For example:

    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    // Routes each record into a directory named after its key.
    class RDDMultipleTextOutputFormat extends MultipleTextOutputFormat[Any, Any] {
      // Output path (relative to the job's output dir) for each record: <key>/<timestamp>
      override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
        key.asInstanceOf[String] + "/" + System.currentTimeMillis() // maybe you can use the stream time instead

      // Return null so that only the value, not the key, is written to the file.
      override def generateActualKey(key: Any, value: Any): Any = null
    }

    // Disable output-spec validation so the job can write into an existing output directory.
    val sc = new SparkContext(
      new SparkConf().set("spark.hadoop.validateOutputSpecs", "false"))

    sc.parallelize(Array("1", "2", "3"), 3)
      .map(a => (a, a))
      .saveAsHadoopFile("/Users/tmp", classOf[String], classOf[String],
        classOf[RDDMultipleTextOutputFormat])
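
With this output format, every record lands in a file under /Users/tmp/<key>/,
and because generateActualKey returns null only the value is written to each
file. Setting spark.hadoop.validateOutputSpecs to false lets the job write
into an output directory that already exists, so repeated jobs keep adding
files under the same per-key directories.

If the data arrives through Spark Streaming (which is where a "stream time"
would come from), the same output format can be driven from foreachRDD. The
sketch below is only illustrative: the socket source, host/port, batch
interval, and the fixed output root are my assumptions, not part of the code
above.

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(10))
    // Hypothetical source for illustration -- substitute your real DStream of pairs.
    val pairs = ssc.socketTextStream("localhost", 9999).map(a => (a, a))

    pairs.foreachRDD { rdd =>
      // Each batch adds new files under the existing per-key directories.
      rdd.saveAsHadoopFile("/Users/tmp", classOf[String], classOf[String],
        classOf[RDDMultipleTextOutputFormat])
    }

    ssc.start()
    ssc.awaitTermination()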





