hbase-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: How often should we reboot hbase cluster---looking for best practice
Date Tue, 29 May 2012 05:54:38 GMT
Richard,

Good to know. Could you also comment on how different your internal
tool is from the new HBCK
(https://issues.apache.org/jira/browse/HBASE-5128 as Ted pointed out
earlier)?

Would also be good if you can share your logs with us for the latter
"every week" cases.

On Mon, May 28, 2012 at 11:50 PM, Xu, Richard <richard.xu@citi.com> wrote:
> Overlapping regions (https://issues.apache.org/jira/browse/HBASE-4238) do not show up
> very often. We know that it is fixed after 0.90.5, but instead of upgrading the production
> HBase cluster, we have an internal tool (it calls the HBase APIs) to fix it.
>
> Regions out of sync (between META and HDFS) and inactive regions appear more often ---
> we can see them every week; again, our internal tool handles these cases as well.
>
>
> -----Original Message-----
> From: Kevin O'dell [mailto:kevin.odell@cloudera.com]
> Sent: Monday, May 28, 2012 1:45 PM
> To: user@hbase.apache.org
> Subject: Re: How often should we reboot hbase cluster---looking for best practice
>
> +1 to what Harsh said.  It sounds to me like you are putting a bandaid on a
> flesh wound.  We should do further analysis and get your cluster to a
> stable state rather than repairing it weekly.  Can you also describe in
> more detail everything you are running to do the repair?  In 0.90.4, fixing
> an overlapping region is not an easy task by any means.
>
> On Mon, May 28, 2012 at 9:14 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> If you're talking of "hbck -fix", then no you don't need to restart
>> HBase after it resolves your issues.
>>
>> Would be good to investigate/know what causes such frequent
>> inconsistencies in your cluster though. It's not normal for
>> inconsistencies to appear regularly every week. Do your region servers
>> crash every week, for instance?
>>
>> On Mon, May 28, 2012 at 9:35 PM, Xu, Richard <richard.xu@citi.com> wrote:
>> > Hi folks,
>> >
>> > This is more of an operations question.
>> >
>> > Our HBase version is 0.90.4. We have a weekly job that fixes known issues,
>> > such as the META table being out of sync and inactive/overlapping/dangling
>> > regions, while HBase is online.
>> >
>> > Should we restart the HBase cluster right after the fix? What is the
>> > best practice here?
>> >
>> > Thanks in advance!
>> >
>> > Richard
>>
>>
>>
>> --
>> Harsh J
>>
>
>
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera



-- 
Harsh J
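
For readers who want to track how often such inconsistencies appear before reaching for "hbck -fix", here is a minimal Python sketch of a read-only check. It is not the internal tool mentioned in this thread. It assumes that the `hbase` launcher is on the PATH and that hbck ends its report with lines of the form "N inconsistencies detected." and "Status: OK" or "Status: INCONSISTENT"; verify that format against your own HBase version. The function names are hypothetical.

```python
import re
import subprocess


def parse_hbck_summary(output: str) -> dict:
    """Pull the inconsistency count and final status out of hbck's text report.

    Assumes the report contains lines such as
    '2 inconsistencies detected.' and 'Status: INCONSISTENT'.
    Returns None for any field that is not found.
    """
    count = re.search(r"^(\d+) inconsistencies detected\.", output, re.MULTILINE)
    status = re.search(r"^Status:\s*(\w+)", output, re.MULTILINE)
    return {
        "inconsistencies": int(count.group(1)) if count else None,
        "status": status.group(1) if status else None,
    }


def run_hbck() -> dict:
    """Run hbck without -fix (report only, nothing is modified) and parse it."""
    result = subprocess.run(["hbase", "hbck"], capture_output=True, text=True)
    return parse_hbck_summary(result.stdout)
```

Running `run_hbck()` from a scheduled job and logging the result gives a week-by-week record that can be lined up against region server crashes or other events.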
