hbase-user mailing list archives

From "Slava Gorelik" <slava.gore...@gmail.com>
Subject Re: Hbase / Hadoop Tuning
Date Thu, 02 Oct 2008 19:38:44 GMT
Thank you, Jim, for the quick answer.
1) If I understand correctly, using 2 clients should roughly double the
performance?
2) Currently, our webapp is the HBase client, using HTable. Is that what you
meant when you said "(HBase, not web) clients"?
3) Is 64MB a minimum size for a single region server, or could it be less?
4) When is the RPC lock on concurrent operations in a single client
planned to be fixed?
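
Regarding 3), for reference, here is roughly how I understand the suggested
change would look in hbase-site.xml. This is only a sketch: the property name
and the 64MB figure come from Jim's reply below, the value is assumed to be
given in bytes (64 * 1024 * 1024 = 67108864), and the description text is my
own wording.

```xml
<!-- Sketch only: goes inside the <configuration> element of hbase-site.xml. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 64MB expressed in bytes (assumed unit); the default mentioned below is 256MB. -->
  <value>67108864</value>
  <description>Lowered from the default to encourage earlier region
  splits, so writes can spread over more than one region server.</description>
</property>
```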

Thank You Again and Best Regards.


On Thu, Oct 2, 2008 at 10:30 PM, Jim Kellerman (POWERSET) <
Jim.Kellerman@microsoft.com> wrote:

> What you are storing is 140,000,000 bytes, so having multiple
> region servers will not help you as a single region is only
> served by a single region server. By default, regions split
> when they reach 256MB. So until the region splits, all traffic
> will go to a single region server. You might try reducing the
> maximum file size to encourage region splitting by changing the
> value of hbase.hregion.max.filesize to 64MB.
>
> Using a single client will also limit write performance.
> Even if the client is multi-threaded, there is a big giant lock
> in the RPC mechanism which prevents concurrent requests (This
> is something we plan to fix in the future).
>
> Multiple clients do not block against one another the way
> multi-threaded clients do currently. So another way to increase
> write performance would be to run multiple (HBase, not web) clients,
> by either running multiple processes directly, or by utilizing
> a Map/Reduce job to do the writes.
>
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
>
>
> > -----Original Message-----
> > From: Slava Gorelik [mailto:slava.gorelik@gmail.com]
> > Sent: Thursday, October 02, 2008 12:07 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: Re: Hbase / Hadoop Tuning
> >
> > Hi. Thank you for the quick response.
> > We are using 7 machines (6 Red Hat 5 and 1 SUSE Enterprise 10).
> > Each machine has 4 CPUs, 4GB RAM and a 200GB HD, connected with a 1Gb
> > network interface.
> > All machines are in the same rack. On one machine (the master) we are
> > running Tomcat with one webapp that is adding 100000 rows. Nothing else
> > is running. When the webapp is not running, the CPU load is less than 1%.
> >
> > We are using Hbase 0.18.0 and Hadoop 0.18.0.
> > Hbase cluster is one master and 6 region servers.
> >
> > Row addition is done by BatchUpdate and commit into a single column
> > family.
> > The data is a simple byte array (1400 bytes per row).
> >
> >
> > Thank You and Best Regards.
> >
> >
> >
> >
> > On Thu, Oct 2, 2008 at 9:39 PM, stack <stack@duboce.net> wrote:
> >
> > > Tell us more, Slava. HBase versions and how many regions you have in
> > > your cluster?
> > >
> > > If small rows, your best boost will likely come when we support
> > > batching of updates: HBASE-748.
> > >
> > > St.Ack
> > >
> > >
> > >
> > > Slava Gorelik wrote:
> > >
> > >> Hi All.
> > >> Our environment - 8 Datanodes (1 is also Namenode),
> > >> 7 of them are also region servers and 1 is Master, default
> > >> replication - 3.
> > >> We have an application that writes heavily with relatively small rows,
> > >> about 10Kb. Current performance is 100000 rows in 580000 millisec,
> > >> i.e. 5.8 millisec / row.
> > >> Is there any way to improve this performance by tuning / tweaking
> > >> HBase or Hadoop ?
> > >>
> > >> Thank You and Best Regards.
> > >>
> > >>
> > >>
> > >
> > >
>
