hbase-user mailing list archives

From Anthony Urso <antho...@cs.ucla.edu>
Subject High throughput input, low latency output?
Date Fri, 07 Oct 2011 19:43:06 GMT
We have a use case that will require a ten-to-twenty-node HBase cluster
on EC2 to take several hundred million rows of input from a larger
number of EMR instances in daily bursts, and then serve those rows via
low-latency random reads, on the order of 300 rows per second. Before
we start coding, I thought it best to ask the experts for their
advice.

1) Is this something that HBase will be able to handle gracefully?
2) Does anyone have any pointers on how to tune HBase for performance
and stability under this load?
3) Would HBase perform better under this sort of load on twelve large
EC2 instances, six xlarge, or three xxlarge?

