hbase-user mailing list archives

From Pavel <pavli...@gmail.com>
Subject Re: help with reduce phase understanding
Date Fri, 01 Aug 2008 13:36:33 GMT
Thank you very much for your answer, Jean-Daniel. I think I now understand
how that scenario works.

I have another scenario (probably not doable with MapReduce, though): I need
to get the total row count for a whole table. I think I could use the
Reporter to increment a counter in the map phase, but how can I then save the
counter value into the 'results' table? Could you please advise how I can
achieve that? Also, what is the preferred way to get a table's row count?

Thank you for your help!
Pavel
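
A minimal sketch of the counter idea above, in plain Java with no Hadoop or
HBase on the classpath: each map task counts the rows of its own region, and
the per-task counts are summed once every task has finished. The names
RowCountSketch, Counter.ROWS, mapRegion and totalRows are hypothetical and
exist only for this illustration. In a real job the framework does the
summing; the driver would then read the aggregated counter after the job
completes (in the old mapred API, to my knowledge, via Reporter.incrCounter
in the mapper and RunningJob.getCounters in the driver) and write that single
value into the 'results' table itself.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class RowCountSketch {
    // Hypothetical counter name, standing in for a job-level counter.
    enum Counter { ROWS }

    // Stand-in for one map task scanning one region: it only bumps a local counter.
    static Map<Counter, Long> mapRegion(List<String> rowKeys) {
        Map<Counter, Long> local = new EnumMap<>(Counter.class);
        for (String ignored : rowKeys) {
            local.merge(Counter.ROWS, 1L, Long::sum);
        }
        return local;
    }

    // Stand-in for the aggregation done once all map tasks have finished.
    static long totalRows(List<List<String>> regions) {
        long total = 0;
        for (List<String> region : regions) {
            total += mapRegion(region).getOrDefault(Counter.ROWS, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        // Three simulated regions, one of them empty.
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("alice", "bob"),
                Arrays.asList("carol"),
                Arrays.<String>asList());
        // This is the single value a driver would write to the results table.
        System.out.println("rows=" + totalRows(regions));
    }
}
```

Because the total only exists after the last map task ends, the write to the
results table belongs in the driver rather than in any mapper or reducer.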

2008/8/1 Jean-Daniel Cryans <jdcryans@gmail.com>

> Pavel,
>
> Since each map processes only one region, a row is stored in only one
> region, and all intermediate keys from a given mapper go to a single
> reducer, there will be no stale data in this situation.
>
> J-D
>
> On Wed, Jul 30, 2008 at 10:09 AM, Pavel <pavlikus@gmail.com> wrote:
>
> > Hi,
> >
> > I feel I lack an understanding of the MapReduce approach and would like
> > to ask some questions (mainly about its reduce part). Below is a reduce
> > job that counts the values for a given row key and inserts the result
> > into another table using the same row key.
> >
> > What makes me doubt is that I cannot figure out how that code would work
> > if several reducers are running. Is it possible that they will process
> > values for the same row key and, as a consequence, write stale data into
> > the table? Say reducerA has counted a total of 5 messages and reducerB
> > 3 messages; would that all end up as the value 8 in the resulting table?
> >
> > Thank you.
> > Pavel
> >
> > public class MessagesTableReduce extends TableReduce<Text, LongWritable> {
> >
> >    public void reduce(Text key, Iterator<LongWritable> values,
> >            OutputCollector<Text, MapWritable> output, Reporter reporter)
> >            throws IOException {
> >
> >        System.out.println("REDUCE: processing messages for author: "
> >                + key.toString());
> >
> >        // Count the values received for this row key.
> >        int total = 0;
> >        while (values.hasNext()) {
> >            values.next();
> >            total++;
> >        }
> >
> >        // Emit the count as column "messages:sent" under the same row key.
> >        MapWritable map = new MapWritable();
> >        map.put(new Text("messages:sent"),
> >                new ImmutableBytesWritable(String.valueOf(total).getBytes()));
> >        output.collect(key, map);
> >    }
> > }
> >
>
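
The reducerA/reducerB question in the quoted thread comes down to how
intermediate keys are assigned to reducers. A minimal sketch in plain Java,
assuming hash-style partitioning in which the reducer index depends only on
the key: every value for a given key then reaches the same reducer, so a
count cannot be split into 5 on one reducer and 3 on another. PartitionSketch,
partition and run are hypothetical names for this illustration, not part of
Hadoop or HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {
    // The reducer index is a pure function of the key (assumed hash partitioning).
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Routes each map output key to its reducer and counts occurrences there.
    static List<Map<String, Integer>> run(List<String> mapOutputKeys, int numReducers) {
        List<Map<String, Integer>> reducers = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducers.add(new TreeMap<>());
        }
        for (String key : mapOutputKeys) {
            reducers.get(partition(key, numReducers)).merge(key, 1, Integer::sum);
        }
        return reducers;
    }

    public static void main(String[] args) {
        // Eight messages by "pavel" and two by "jd", as emitted by the map phase.
        List<String> keys = Arrays.asList("pavel", "jd", "pavel", "pavel", "pavel",
                "jd", "pavel", "pavel", "pavel", "pavel");
        List<Map<String, Integer>> reducers = run(keys, 2);
        for (int i = 0; i < reducers.size(); i++) {
            System.out.println("reducer " + i + " -> " + reducers.get(i));
        }
    }
}
```

Whichever reducer receives "pavel", it receives all eight values, so the
single row written for that key carries the full count.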
