spark-user mailing list archives

From im281
Subject RDD flatmap to multiple key/value pairs
Date Fri, 02 Dec 2016 16:05:40 GMT
Here is a MapReduce example implemented in Java.
It reads each line of text and, for each word in the line, determines
whether it starts with an uppercase letter. If so, it emits a key/value
pair of the word and the count 1.

public class CountUppercaseMapper
    extends Mapper<LongWritable,Text,Text,IntWritable> {
  @Override
  protected void map(LongWritable lineNumber, Text line, Context context)
      throws IOException, InterruptedException {
    for (String word : line.toString().split(" ")) {
      // Skip empty tokens (produced by repeated spaces) before reading the first character
      if (!word.isEmpty() && Character.isUpperCase(word.charAt(0))) {
        context.write(new Text(word), new IntWritable(1));
      }
    }
  }
}

What is the equivalent Spark implementation?
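
To make the shape of the question concrete, here is a minimal sketch of the same logic in plain Java streams rather than Spark itself: each line is flat-mapped to zero or more (word, 1) pairs, and the values are then summed per key. The class and method names (UppercaseFlatMapSketch, toPairs, countUppercase) are made up for illustration. In Spark's Java API this shape would presumably map onto JavaRDD.flatMapToPair followed by JavaPairRDD.reduceByKey, but that mapping is an assumption and is not shown or tested here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UppercaseFlatMapSketch {

    // Per-line step: one input line becomes zero or more (word, 1) pairs.
    static Stream<Map.Entry<String, Integer>> toPairs(String line) {
        return Arrays.stream(line.split(" "))
                // Ignore empty tokens, then keep words starting with an uppercase letter.
                .filter(word -> !word.isEmpty() && Character.isUpperCase(word.charAt(0)))
                .<Map.Entry<String, Integer>>map(word -> new SimpleEntry<>(word, 1));
    }

    // Flatten the per-line pairs into one stream and sum the values per key.
    static Map<String, Integer> countUppercase(List<String> lines) {
        return lines.stream()
                .flatMap(UppercaseFlatMapSketch::toPairs)
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue, Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Spark and Hadoop", "Hadoop runs MapReduce");
        System.out.println(countUppercase(lines)); // {Hadoop=2, MapReduce=1, Spark=1}
    }
}
```

The point of the sketch is that the per-line function returns a collection of pairs instead of calling context.write repeatedly; the flattening and the per-key aggregation happen in separate steps.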

Below is a more use-case-specific example that works with objects.

In this case, the mapper emits multiple key/value pairs for each input line, one per isotope cluster.

What is the equivalent Spark implementation?


public class IsotopeClusterMapper extends Mapper<LongWritable, Text, Text, Text> {

	@Override
	protected void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		System.out.println("Inside Isotope Cluster Map !");
		String line = value.toString();

		// Get the isotope clusters for this line and write each one out as text
		Detector detector = new Detector();
		ArrayList<IsotopeCluster> clusters = detector.GetClusters(line);

		// Emit one key/value pair per cluster
		for (int i = 0; i < clusters.size(); i++) {
			String cKey = detector.WriteClusterKey(clusters.get(i));
			String cValue = detector.WriteClusterValue(clusters.get(i));
			context.write(new Text(cKey), new Text(cValue));
		}
	}
}
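
As a rough illustration of the object case, the sketch below (again plain Java streams, not Spark) turns each line into a list of (key, value) pairs and then groups the values by key. Cluster and getClusters are hypothetical stand-ins for IsotopeCluster and Detector.GetClusters, whose real behavior is not shown in this message; here every "key=value" token on a line is simply treated as one cluster. In Spark's Java API the per-line function would presumably be passed to JavaRDD.flatMapToPair, which is an assumption not exercised here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class ClusterPairsSketch {

    // Hypothetical stand-in for IsotopeCluster: just a key and a value.
    static final class Cluster {
        final String key;
        final String value;
        Cluster(String key, String value) { this.key = key; this.value = value; }
    }

    // Hypothetical stand-in for Detector.GetClusters: each "key=value"
    // token on the line is treated as one cluster.
    static List<Cluster> getClusters(String line) {
        List<Cluster> clusters = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            String[] parts = token.split("=", 2);
            if (parts.length == 2) {
                clusters.add(new Cluster(parts[0], parts[1]));
            }
        }
        return clusters;
    }

    // One line in, a list of (key, value) pairs out.
    static List<Map.Entry<String, String>> toPairs(String line) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (Cluster c : getClusters(line)) {
            pairs.add(new SimpleEntry<>(c.key, c.value));
        }
        return pairs;
    }

    // Flatten all per-line pairs and collect the values under each key.
    static Map<String, List<String>> groupByKey(List<String> lines) {
        return lines.stream()
                .flatMap(line -> toPairs(line).stream())
                .collect(Collectors.groupingBy(Map.Entry::getKey, TreeMap::new,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a=1 b=2", "a=3");
        System.out.println(groupByKey(lines)); // {a=[1, 3], b=[2]}
    }
}
```

The grouping step is only there to show what ends up under each key; whether the real job needs grouping, reduction, or just the raw pairs depends on what the reducer does, which is not part of this message.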
