hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: High variance in results for hbase benchmarking
Date Fri, 04 Mar 2011 08:28:11 GMT
> > Since we are using EC2 Large instances, it seems
> > unlikely that network or some other virtualization
> > related resources crunch are affecting our
> > performance measurement.

Your assumptions are wrong. It seems only c1.xlarge and m2.4xlarge may be assigned dedicated hardware.
Reference: http://huanliu.wordpress.com/2010/06/14/amazons-physical-hardware-and-ec2-compute-unit/
Even on those instance types, the shared disk storage (instance-store) would still be impacted by neighbors.

I think the only way you will approach consistent results is if you use the cluster compute instances
(cc1.4xlarge). These are a completely different architecture: HVM instead of PVM, dedicated 10GigE
network, dedicated physical hosts, etc.

With other instance types I see large variance from day to day, even from hour to hour. In short,
EC2 is useless for performance benchmarking. It's very handy for a lot of other things though,
like functional or smoke testing.
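
To put a number on run-to-run variance before trusting any single result, something like the following rough Python sketch may help. It is only an illustration: the function name and the sample throughput figures are hypothetical, not measurements from this thread, and "variance" is read here as either the coefficient of variation or the max-over-min spread, which is one plausible interpretation of the percentages quoted in this thread.

```python
import statistics


def run_to_run_spread(throughputs):
    """Summarize variation across repeated runs of the same benchmark.

    `throughputs` holds one number per run (e.g. ops/sec). Hypothetical helper.
    """
    if len(throughputs) < 2:
        raise ValueError("need at least two runs to measure variance")
    mean = statistics.mean(throughputs)
    stdev = statistics.stdev(throughputs)  # sample standard deviation
    lowest, highest = min(throughputs), max(throughputs)
    return {
        "mean": mean,
        # Standard deviation as a percentage of the mean.
        "cv_percent": 100.0 * stdev / mean,
        # How much faster the best run was than the worst run.
        "range_percent": 100.0 * (highest - lowest) / lowest,
    }


if __name__ == "__main__":
    # Made-up ops/sec figures for four runs of the same test.
    samples = [1000.0, 1250.0, 800.0, 2000.0]
    for name, value in run_to_run_spread(samples).items():
        print(f"{name}: {value:.1f}")
```

If the spread across repeated runs is of the same order as the difference you are trying to measure, the comparison is not telling you much.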

For additional information see: http://www.comp.nus.edu.sg/~vldb2010/proceedings/files/papers/E02.pdf

   - Andy


--- On Thu, 3/3/11, Gary Helmling <ghelmling@gmail.com> wrote:

> From: Gary Helmling <ghelmling@gmail.com>
> Subject: Re: High variance in results for hbase benchmarking
> To: user@hbase.apache.org
> Cc: "Aditya Sharma" <adityadsharma@gmail.com>
> Date: Thursday, March 3, 2011, 11:37 PM
> On Thu, Mar 3, 2011 at 10:19 PM,
> Aditya Sharma <adityadsharma@gmail.com> wrote:
> 
> >
> > Since we are using EC2 Large instances, it seems
> > unlikely that network or some other virtualization
> > related resources crunch are affecting our
> > performance measurement.
> >
> >
> You are guaranteed to see large variance in results when
> benchmarking on EC2.  Welcome to the oversubscribed public
> cloud!  You can run the same test twice with the same
> instances and still see massive differences.  You should
> expect at least 25% variance between test runs (in practice
> I've seen as much as 100% variance myself).
> 
> Two nodes is a very small cluster to be benchmarking on.
> The typically recommended minimum cluster size is something
> like 1 master node (NN, JT and HBase Master) + 3 slaves (DN,
> TT and Region Server).  But HBase really works best when you
> start to approach 10 slaves or more.
[...]



      
