Message-ID: <6eb82e0806080234u6fd30e3au88c2bd7f7262b4a7@mail.gmail.com>
Date: Sun, 8 Jun 2008 17:34:59 +0800
From: "Rong-en Fan"
To: hbase-user@hadoop.apache.org
Subject: Re: re-balance regions based on avgLoad
In-Reply-To: <6eb82e0806080226j5f1c173cr7afabfc7915ed917@mail.gmail.com>
References: <6eb82e0806071953h7591a7a3p21e3c8f900c91cf0@mail.gmail.com>
 <6eb82e0806071955j2a8f2bebub49a8a5aeac56532@mail.gmail.com>
 <6eb82e0806080226j5f1c173cr7afabfc7915ed917@mail.gmail.com>

On Sun, Jun 8, 2008 at 5:26 PM, Rong-en Fan wrote:
> On Sun, Jun 8, 2008 at 11:19 AM, Bryan Duxbury wrote:
>> We have an issue open tracking the fact that the rebalancing code has a
>> tendency to oscillate. I haven't had the time to look at it for a while. It
>> definitely needs some attention.
>>
>> The point of checking whether the avgLoad is > 2.0 is to ensure that no
>> rebalancing occurs before there are more regions than there are servers. It
>> doesn't do anything in that case.
>
> Aha, yes.... I mis-read the code... I will watch HBASE-71.

Sorry, it should be 615.

>
> Sorry for the noise.
>
> Regards,
> Rong-En Fan
>
>>
>> -Bryan
>>
>> On Jun 7, 2008, at 7:55 PM, Rong-en Fan wrote:
>>
>>> On Sun, Jun 8, 2008 at 10:53 AM, Rong-en Fan wrote:
>>>>
>>>> I'm playing with the latest hbase trunk and noticed a region
>>>> close-then-assign loop.
>>>>
>>>> I have 3 region servers and about 108 regions in total, and the
>>>> avgLoad is 36.0. Then there is this code in RegionManager.java:
>>>>
>>>>   if (regionsToAssign.size() == 0) {
>>>>     // There are no regions waiting to be assigned. This is an opportunity
>>>>     // for us to check if this server is overloaded.
>>>>     double avgLoad = master.serverManager.getAverageLoad();
>>>>     if (avgLoad > 2.0 && thisServersLoad.getNumberOfRegions() > avgLoad) {
>>>>
>>>> If I understand correctly, when there are no outstanding unassigned
>>>> regions, RegionManager checks whether a region server is overloaded
>>>> by the # of loaded regions on that region server.
>>>> It then seems to me that avgLoad > 2.0 is quite unrealistic under the
>>>> current calculation of "avgLoad".
>>>
>>> Forgot to mention: in my situation, all 3 boxes are kicking regions
>>> among themselves, as all loaded regions > avgLoad... even after
>>> assigning some to others.
>>>
>>> Regards,
>>> Rong-En Fan
>>>
>>>> Wouldn't it be better to consider system load, or to base this on the
>>>> # of loaded regions and # of requests?
>>>>
>>>> Thanks,
>>>> Rong-En Fan
>>>>
>>
>>
>
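
For readers following the thread, here is a minimal standalone sketch of the condition quoted above. It is not HBase code: the class name `RebalanceCheckSketch` and both helper methods are hypothetical, and it assumes "avgLoad" is simply the mean number of regions per server, which matches the 3 servers / 108 regions / 36.0 figures in the thread. It only shows how the two parts of the check behave: the `avgLoad > 2.0` guard that Bryan describes, and the strict `>` comparison against the average.

```java
import java.util.Arrays;

/** Standalone illustration of the overload check quoted in this thread. Not HBase code. */
final class RebalanceCheckSketch {

    /** Assumed definition of "avgLoad": mean number of regions per server. */
    static double averageLoad(int[] regionsPerServer) {
        return Arrays.stream(regionsPerServer).average().orElse(0.0);
    }

    /** Mirrors the quoted condition: avgLoad > 2.0 && numberOfRegions > avgLoad. */
    static boolean isOverloaded(int regionsOnServer, double avgLoad) {
        return avgLoad > 2.0 && regionsOnServer > avgLoad;
    }

    public static void main(String[] args) {
        // Hypothetical distribution of the 108 regions across 3 servers.
        int[] cluster = {37, 36, 35};
        double avgLoad = averageLoad(cluster);
        for (int regions : cluster) {
            System.out.println(regions + " regions, overloaded: "
                    + isOverloaded(regions, avgLoad));
        }
    }
}
```

Under these assumptions, a server holding a single region more than the mean already counts as overloaded, because the comparison has no tolerance band. Whether that contributes to the oscillation tracked in the issue mentioned above is not established in this thread.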