hbase-user mailing list archives

From suresh babu <bigdatac...@gmail.com>
Subject Re: Regarding Hardware configuration for HBase cluster
Date Thu, 06 Feb 2014 17:33:29 GMT
Refreshing the thread.

Can you please suggest any inputs on the hardware configuration (for the
use case described below)?




On Wed, Feb 5, 2014 at 10:31 AM, suresh babu <bigdatacslt@gmail.com> wrote:

> Please find the data requirements for our use case below:
>
> Raw data processing
> ----------------------------------
> 1. Data is loaded into HDFS; after ETL, around 3 billion puts per day
> go into HBase
>
> 2. Data older than X days is to be deleted from HBase
>
> Aggregates processing
> ----------------------------------
> 3 billion reads per day ... large scans or reads
>
> KV size: around 1 KB
> Daily processing, raw and aggregates, via M/R jobs
> Hive queries in the future, but not an immediate focus
>
> On Feb 5, 2014 12:48 AM, "Vladimir Rodionov" <vrodionov@carrieriq.com>
> wrote:
>
>> Yes,
>>
>> 1. What is the expected avg and peak load in writes/updates/deletes/reads?
>> 2. What is the average size of a KV?
>> 3. What is the percentage split among reads, small scans, medium scans, and large scans?
>> 4. Do you plan to run M/R jobs or Hive queries?
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodionov@carrieriq.com
>>
>> ________________________________________
>> From: Nick Xie [nick.xie.hadoop@gmail.com]
>> Sent: Tuesday, February 04, 2014 10:02 AM
>> To: user@hbase.apache.org
>> Subject: Re: Regarding Hardware configuration for HBase cluster
>>
>> I guess you'd better describe your applications in a bit more detail.
>> Does the data grow over time at all?
>>
>> Nick
>>
>>
>> On Tue, Feb 4, 2014 at 5:22 AM, suresh babu <bigdatacslt@gmail.com>
>> wrote:
>>
>> > Hi folks,
>> >
>> > We are trying to set up an HBase cluster for the following requirement:
>> >
>> > We have to maintain around 800 TB of data.
>> >
>> > For the above requirement, please suggest the best hardware configuration
>> > details, such as:
>> >
>> > 1) How many disks to use per machine, and the capacity of each disk; for
>> > example, 16/24 disks per node with 1/2 TB capacity per disk
>> >
>> > 2) Which compression method is suited to a production environment? Space is
>> > not a major limitation, but speed is of prime concern for my use case
>> >
>> > 3) How many CPU cores should be configured for each node/machine? Or what is
>> > the ideal ratio of cores to disks, for example 1 core per disk?
>> >
>> > Regards,
>> > Kaushik
>> >
>>
>>
>
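
For anyone following the thread, the figures quoted above (3 billion puts per
day, KV size around 1 KB, around 800 TB total, 16/24 disks of 1/2 TB per node)
can be turned into a rough back-of-the-envelope estimate. The sketch below is
only that: it assumes the 800 TB is logical data before replication, an HDFS
replication factor of 3 (the HDFS default), no compression, and a hypothetical
75% usable fraction per disk to leave headroom for compactions and temporary
data. None of these assumptions come from the thread, and the function names
are made up for illustration.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the thread.
PUTS_PER_DAY = 3_000_000_000
KV_SIZE_BYTES = 1_000          # "around 1 KB", taken here as 1000 bytes
TOTAL_DATA_TB = 800

# Assumptions NOT stated in the thread.
HDFS_REPLICATION = 3           # HDFS default replication factor
USABLE_FRACTION = 0.75         # hypothetical headroom for compactions etc.


def avg_ops_per_second(ops_per_day):
    """Average rate if the daily load were spread evenly over 24 hours."""
    return ops_per_day / SECONDS_PER_DAY


def daily_ingest_tb(puts_per_day, kv_size_bytes):
    """Uncompressed, unreplicated data written per day, in decimal TB."""
    return puts_per_day * kv_size_bytes / 1e12


def nodes_needed(data_tb, disks_per_node, disk_tb,
                 replication=HDFS_REPLICATION, usable=USABLE_FRACTION):
    """Nodes required to hold data_tb by disk capacity alone."""
    raw_needed_tb = data_tb * replication
    usable_per_node_tb = disks_per_node * disk_tb * usable
    return math.ceil(raw_needed_tb / usable_per_node_tb)


if __name__ == "__main__":
    print(f"avg puts/sec: {avg_ops_per_second(PUTS_PER_DAY):,.0f}")
    print(f"daily ingest: {daily_ingest_tb(PUTS_PER_DAY, KV_SIZE_BYTES):.1f} TB")
    for disks, size_tb in [(16, 1), (16, 2), (24, 1), (24, 2)]:
        n = nodes_needed(TOTAL_DATA_TB, disks, size_tb)
        print(f"{disks} x {size_tb} TB disks per node: {n} nodes")
```

This only covers storage capacity and the average write rate. Peak load, the
read/scan mix, and CPU and memory per node (the points raised in Vladimir's
questions) would change the answer, so treat the node counts as a lower bound
on capacity rather than a recommendation.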
