MIME-Version: 1.0
Sender: arber.research@gmail.com
In-Reply-To: <31a243e70906070856p573a4dfcs6709f09a575d13aa@mail.gmail.com>
References: <382e1efc0906070621he7ffee1pfda911711975537d@mail.gmail.com> <31a243e70906070856p573a4dfcs6709f09a575d13aa@mail.gmail.com>
Date: Mon, 8 Jun 2009 11:03:54 +0800
Message-ID: <382e1efc0906072003i9ea9733h5ab51ce12a370d5@mail.gmail.com>
Subject: Re: Again, HBase Data Lost!!
From: Yabo-Arber Xu
To: hbase-user
Content-Type: multipart/alternative; boundary=001e680f1bb80fea09046bcd7e46

--001e680f1bb80fea09046bcd7e46
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Hi J-D,

Thanks for your reply. We have a 10-node cluster running HBase/Hadoop 0.19.1. We wrote about 1.8M records into HBase, and everything appeared fine; we could use 'count' in the shell to get the number of rows. But after a while (I am not sure exactly when), all nodes crashed right away. I checked the logs but did not find anything unusual around that time. When I restarted the cluster, all the rows in the tables were lost, but the tables themselves were still there.

I am having difficulty connecting to the IRC channel right now (I will try again later). For now, I have shared the master's log here; the crash happened before the line "18:46:54 CST Starting master on A6". You will notice that the period from "2009-06-07 17:00:57" to "2009-06-07 18:46:54" is empty (due to the crash).

Thanks for your help!

Best,
Arber

On Sun, Jun 7, 2009 at 11:56 PM, Jean-Daniel Cryans wrote:

> Arber,
>
> From your email I have a hard time understanding exactly what happened
> on your cluster. But, before asking any question, I have to say that I
> never saw data just "disappearing" like this on the production cluster
> I've been managing for the last year.
>
> So, what happened? Did you lose the region server holding .META or
> -ROOT-? Was it after an importing job and all nodes crashed right
> away? If any crash, do you know why it happened?
>
> Giving us those details (and more) will help us solve your problem.
> You can also drop by the hbase IRC channel, there's always people
> there happy to take a look at your logs and debugging goes much
> faster.
>
> Thx,
>
> J-D
>
> On Sun, Jun 7, 2009 at 9:21 AM, Yabo-Arber Xu wrote:
> > Hi there,
> >
> > I had a hbase data lost couple of days ago due to the crash of HBase. At
> > that time i asked on the list and was told that it may be due to the lost of
> > META data on the master ( not flushed into disk during crash ). Just several
> > days after, this happened to me again, this time all the data is gone, while
> > all the tables are still there. This time is on a 10-node hbase cluster.
> >
> > I checked the log but did not find anything strange. Could anybody shed a
> > light on this? Given such stability, i really worried whether we can use it
> > in production phase.
> >
> > Best,
> > Arber
> >
> > On Mon, May 25, 2009 at 10:25 AM, Yabo-Arber Xu <arber.research@gmail.com> wrote:
> >
> >> Hi there,
> >>
> >> I had a single-node cluster up and running. And yesterday the node crashed
> >> for unknown reason, and when I restart it, everything appears to work except
> >> that all the tables are LOST( UI says that there is no user tables)!! I
> >> checked the log file and didn't find any clue; while I found the tables
> >> files are still there on HDFS.
> >>
> >> Anybody has any clue?
> >>
> >> It's quite urgent. Any help will be really appreciated.
> >>
> >> Best,
> >> Arber

--001e680f1bb80fea09046bcd7e46--