spark-user mailing list archives

From 付雅丹 <yadanfu1...@gmail.com>
Subject How to write mapreduce programming in spark by using java on user-defined javaPairRDD?
Date Tue, 07 Jul 2015 14:18:56 GMT
Hi, everyone!

I've got <key,value> pairs in the form <LongWritable, Text>, which I read with the following code:

SparkConf conf = new SparkConf().setAppName("MapReduceFileInput");
JavaSparkContext sc = new JavaSparkContext(conf);
Configuration confHadoop = new Configuration();

JavaPairRDD<LongWritable, Text> sourceFile = sc.newAPIHadoopFile(
        "hdfs://cMaster:9000/wcinput/data.txt",
        DataInputFormat.class, LongWritable.class, Text.class, confHadoop);

Now I want to transform this JavaPairRDD from <LongWritable, Text> into another
<LongWritable, Text> whose Text content is different, and then write the Text
values to HDFS ordered by the LongWritable key. But I don't know how to write
the map and reduce functions in Spark using Java. Can someone help me?
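For reference, one common pattern for this kind of transformation is mapToPair
followed by sortByKey and a save. The sketch below is only an illustration of
that pattern: the transformLine helper and the output path are placeholders I
made up, not anything from the original job. Note one known pitfall: Hadoop
RecordReaders reuse Writable objects, so it is safer to convert LongWritable
and Text to plain Long and String before shuffling or sorting.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;

// Placeholder transformation; replace with your own logic.
static String transformLine(String line) {
    return line.toUpperCase();
}

// Convert the reused Writables to immutable Java types, apply the
// transformation, sort by the original offset key, and write the values.
JavaPairRDD<Long, String> transformed = sourceFile.mapToPair(
        pair -> new Tuple2<>(pair._1().get(),
                             transformLine(pair._2().toString())));

transformed.sortByKey()          // order by LongWritable value
           .values()             // keep only the transformed text
           .saveAsTextFile("hdfs://cMaster:9000/wcoutput"); // hypothetical path
```

Note that saveAsTextFile writes one part file per partition; the ordering from
sortByKey is preserved across part files in key order.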


Sincerely,
Missie.
