hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: HBase Failing on Large Loads
Date Fri, 12 Jun 2009 19:07:23 GMT
Generally you want the number of map partitions to equal the number of
table regions.

From there, you set in the Hadoop config how many tasks run at the same
time per machine.
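
To make the arithmetic concrete, here is a rough sketch, not anything taken from HBase or Hadoop itself. It assumes one map task per region and a fixed per-machine limit on concurrent maps (in Hadoop releases of this era that limit is set per TaskTracker with mapred.tasktracker.map.tasks.maximum, and reducers with mapred.tasktracker.reduce.tasks.maximum). The class name, method names, and the machine count of 4 are hypothetical; only the 144 regions and 2 maps per machine come from this thread.

```java
// Sketch only: estimates how a region-per-map job spreads over a cluster.
// MapWaves and its methods are illustrative names, not part of any library.
final class MapWaves {

    // One map task per table region, following the rule of thumb above.
    static int mapTasks(int tableRegions) {
        return tableRegions;
    }

    // Number of maps that can run at the same time across the cluster.
    static int concurrentMaps(int machines, int maxMapsPerMachine) {
        return machines * maxMapsPerMachine;
    }

    // Number of rounds ("waves") needed to run every map task.
    static int waves(int tableRegions, int machines, int maxMapsPerMachine) {
        int slots = concurrentMaps(machines, maxMapsPerMachine);
        if (slots <= 0) {
            throw new IllegalArgumentException("need at least one map slot");
        }
        // Ceiling division: a partial final wave still counts as a wave.
        return (mapTasks(tableRegions) + slots - 1) / slots;
    }

    public static void main(String[] args) {
        int regions = 144;    // region count from the question in this thread
        int machines = 4;     // hypothetical cluster size
        int perMachine = 2;   // "2 maps per machine" from the question
        System.out.println("map tasks:       " + mapTasks(regions));
        System.out.println("concurrent maps: " + concurrentMaps(machines, perMachine));
        System.out.println("waves:           " + waves(regions, machines, perMachine));
    }
}
```

Raising the per-machine limit or adding machines shortens the number of waves; it does not change the number of map tasks, which still follows the region count.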

On Fri, Jun 12, 2009 at 11:59 AM, stack <stack@duboce.net> wrote:

> On Fri, Jun 12, 2009 at 11:50 AM, mike anderson <saidtherobot@gmail.com> wrote:
>
> >
> > I'm wondering how you set up your job to run 2 maps/1 reducer per machine.
> > Is this a matter of adding more region servers? I currently have 1
> > regionserver and 144 regions (living on the same cluster as hadoop).
> >
>
> TableInputFormat makes as many maps as there are regions (with some
> caveats).  My guess is that you only have 4 regions in your table since you
> don't have that many rows? Your best bet is to study TIF#getSplits.  You
> could override it to get more maps, or just trust that when you have more
> data in the table, and therefore more regions, more maps will be run.
>
> On the reduce side, I'm not sure.  Check TableOutputFormat but I'd say 1
> reduce per machine is the default.  In this case HBase is probably respecting
> what you have configured in your hadoop-site.xml/mapred-site.xml.
>
> St.Ack
>
