From: Chris Tarnas
Subject: Re: Hardware configuration
Date: Mon, 2 May 2011 10:03:44 -0700
To: user@hbase.apache.org

What are some of the common pitfalls of having different configurations for different nodes? Is the problem more one of management, making sure each type of node has its own config (so a 12-core box has 12 mappers and reducers, an 8-core box has 8, drive layouts, etc.), or are there problems that configuration changes can't deal with?

thanks,
-chris

On May 2, 2011, at 6:57 AM, Michael Segel wrote:

> Hi,
>
> That's actually a really good question.
> Unfortunately, the answer isn't really simple.
>
> You're going to need to estimate your growth and you're going to need to estimate your configuration.
>
> Suppose I know that within 2 years, the amount of data that I want to retain is going to be 1PB. With a 3x replication factor, I'll need at least 3PB of disk. Assuming that I can fit 12x2TB drives in a node, I'll need 125-150 machines. (There's some overhead for logging and OS.)
>
> Now this doesn't mean that I'll need to buy all of the machines today and build out the cluster.
> It means that I will need to figure out my machine room (rack space, power, etc.) and also hardware configuration.
>
> You'll also need to plan out your hardware choices. An example: you may want 10GbE on the switch but not at the data node. However, you're going to want to be able to expand your data nodes to be able to add 10GbE cards.
>
> The idea is that as I build out my cluster, all of the machines have the same look and feel. So if you buy quad-core CPUs and they are 2.2 GHz but 6 months from now you buy 2.6 GHz CPUs, as long as they are 4-core CPUs, your cluster will look the same.
>
> The point is that when you lay out your cluster to start with, you'll need to plan ahead and keep things similar. Also you'll need to make sure your NameNode has enough memory...
>
> Having said that... Yahoo! has written a paper detailing MR2 (next generation of map/reduce). As the M/R job scheduler becomes more intelligent about the types of jobs and types of hardware, the consistency of hardware becomes less important.
>
> With respect to HBase, I suspect there will be a parallel evolution.
>
> As to building out and replacing your cluster... if this is a production environment, you'll have to think about DR and building out a second cluster. So the cost of replacing clusters should also be factored in when you budget for hardware.
>
> Like I said, it's not a simple answer and you have to approach each instance separately and fine-tune your cluster plans.
>
> HTH
>
> -Mike
>
>
> ----------------------------------------
>> Date: Mon, 2 May 2011 09:53:05 +0300
>> From: iulia.zidaru@1and1.ro
>> To: user@hbase.apache.org
>> CC: stack@duboce.net
>> Subject: Re: Hardware configuration
>>
>> Thank you both. How would you estimate really big clusters, with
>> hundreds of nodes? Requirements might change over time, and replacing an
>> entire cluster does not seem like the best solution...
>>
>>
>>
>> On 04/29/2011 07:08 PM, Stack wrote:
>>> I agree with Michael Segel. Distributed computing is hard enough.
>>> There is no need to add extra complexity.
>>>
>>> St.Ack
>>>
>>> On Fri, Apr 29, 2011 at 4:05 AM, Iulia Zidaru wrote:
>>>> Hi,
>>>> I'm wondering if having a cluster with different machines in terms of CPU,
>>>> RAM and disk space would be a big issue for HBase. For example, machines
>>>> with 12 GB of RAM and machines with 48 GB. Suppose that we use them at full
>>>> capacity. What problems might we encounter with this kind of
>>>> configuration?
>>>> Thank you,
>>>> Iulia
>>>>
>>>>
>>
>>
>> --
>> Iulia Zidaru
>> Java Developer
>>
>> 1&1 Internet AG - Bucharest/Romania - Web Components Romania
>> 18 Mircea Eliade St
>> Sect 1, Bucharest
>> RO Bucharest, 012015
>> iulia.zidaru@1and1.ro
>> 0040 31 223 9153
>>
>>
>>
>
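
The sizing estimate in Michael Segel's reply (1PB retained, 3x replication, 12x2TB drives per node, 125-150 machines) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `nodes_needed` and the `usable_fraction` parameter are hypothetical, and the 0.85 value used for logging/OS overhead is an assumption chosen for the example, not a figure from the thread. It also assumes 1PB = 1000TB.

```python
import math


def nodes_needed(retained_tb, replication=3, drives_per_node=12,
                 drive_tb=2.0, usable_fraction=1.0):
    """Estimate how many data nodes are needed to hold the retained data.

    retained_tb      -- logical data to retain, in TB
    replication      -- HDFS replication factor (3x in the thread)
    drives_per_node  -- drives per machine (12 in the thread)
    drive_tb         -- capacity of each drive in TB (2 in the thread)
    usable_fraction  -- share of raw disk left after logging/OS overhead
                        (hypothetical knob; the thread gives no number)
    """
    raw_tb = retained_tb * replication
    per_node_tb = drives_per_node * drive_tb * usable_fraction
    return math.ceil(raw_tb / per_node_tb)


# 1PB retained with no overhead allowance: 3000TB / 24TB per node
print(nodes_needed(1000))                        # 125

# Same estimate with an assumed 15% of each node's disk lost to overhead
print(nodes_needed(1000, usable_fraction=0.85))  # 148
```

With no overhead the result is the lower bound of 125 machines quoted in the reply; any overhead allowance pushes the count toward the upper end of the 125-150 range.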