hbase-user mailing list archives

From Karthik Kambatla <kkamb...@cs.purdue.edu>
Subject Re: TableMapReduceUtil parallel puts to same row
Date Tue, 27 Jul 2010 16:57:20 GMT
Hi Jean

I looked at the TableMapReduceUtil code and I implemented my own version of
TableOutputFormat to find and isolate the problem.

In TableOutputFormat, table.setAutoFlush(false) is called so that writes are
buffered and sent in batches. In our case, a batch can contain multiple puts on
the same row, and only a few of them are committed. After I removed that line
in MyOutputFormat, most of the commits go through.

What is the expected behavior in the following case?

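// Two puts against the same row key (0) and the same column, sent in one batch.
ArrayList<Put> puts = new ArrayList<Put>();

Put p1 = new Put(Bytes.toBytes(0));
p1.add(family, column, Bytes.toBytes(1));  // first value
puts.add(p1);

Put p2 = new Put(Bytes.toBytes(0));
p2.add(family, column, Bytes.toBytes(2));  // second value, same cell
puts.add(p2);

table.put(puts);  // submit both puts in a single call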

Thanks
Karthik
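
Note: one possible explanation for the behavior described above, offered as an
assumption rather than a confirmed diagnosis, is that puts without an explicit
timestamp are stamped by the server at write time. If two puts to the same cell
are applied within the same millisecond, they would share one timestamp and only
one version would remain. The sketch below is a toy model in plain Java, not the
HBase client API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a versioned cell; hypothetical, not part of the HBase client API. */
public class VersionCollisionSketch {

    /** One cell (row + column): versions are keyed by timestamp. */
    public static final class Cell {
        private final TreeMap<Long, Integer> versions = new TreeMap<>();

        /** Writing at an existing timestamp replaces that version. */
        public void put(long timestamp, int value) {
            versions.put(timestamp, value);
        }

        public int versionCount() {
            return versions.size();
        }

        /** Values ordered from oldest to newest timestamp. */
        public List<Integer> values() {
            return new ArrayList<>(versions.values());
        }
    }

    /** Every write in the batch is stamped with the same time (the assumed batch case). */
    public static Cell applyWithSharedTimestamp(int[] values, long now) {
        Cell cell = new Cell();
        for (int v : values) {
            cell.put(now, v);
        }
        return cell;
    }

    /** Each write gets its own timestamp, so each one survives as a version. */
    public static Cell applyWithDistinctTimestamps(int[] values, long base) {
        Cell cell = new Cell();
        for (int i = 0; i < values.length; i++) {
            cell.put(base + i, values[i]);
        }
        return cell;
    }

    public static void main(String[] args) {
        int[] batch = {1, 2};
        Cell shared = applyWithSharedTimestamp(batch, 1000L);
        Cell distinct = applyWithDistinctTimestamps(batch, 1000L);
        System.out.println(shared.versionCount() + " " + shared.values());     // prints "1 [2]"
        System.out.println(distinct.versionCount() + " " + distinct.values()); // prints "2 [1, 2]"
    }
}
```

If this assumption holds, supplying a distinct explicit timestamp on each Put
(and making sure the column family keeps more than one version) may preserve
both values; check the Put Javadoc for your HBase version before relying on it.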



On Tue, Jul 27, 2010 at 9:25 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> TableOutputFormat is really just a wrapper around an HTable; see for
> yourself:
> http://github.com/apache/hbase/blob/0.20/src/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
>
> So there must be something else about the way you use it, or the way
> you use HTable directly. Showing bits of your code could be helpful.
>
> J-D
>
> On Mon, Jul 26, 2010 at 11:17 PM, Karthik Kambatla
> <kkambatl@cs.purdue.edu> wrote:
> > Hi
> >
> > I am experiencing a few problems with TableMapReduceUtil, wherein only some
> > of the puts from the reduce are written to the output table. If I explicitly
> > write to the table from within reduce without using TableMapReduceUtil, all
> > the puts are written to the table.
> >
> > In our application, multiple puts could be on the same row. In case two puts
> > are on the same key, our application requires both puts to be committed as
> > two different versions.
> >
> > Am I missing something here? Is there a cleaner way to approach this issue?
> >
> > Thanks for the help.
> >
> > Karthik
> >
>
