From 冯宏华 <fenghong...@xiaomi.com>
Subject Re: Dropping a very large table
Date Tue, 10 Sep 2013 04:22:07 GMT
There seems to be no very simple way to do this. I'm not sure whether closing/unassigning regions gradually via a script
before dropping would help a little.
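
A rough sketch of the "close gradually" idea follows. It is only an illustration under stated assumptions: RegionCloser and closeGradually are made-up names, not HBase APIs, and the actual call that closes or unassigns a single region on a given HBase version has to be plugged in by the caller. The batch size and pause are arbitrary values to tune.

```java
import java.util.ArrayList;
import java.util.List;

public class GradualRegionCloser {

    /** Hypothetical hook that closes one region; NOT an HBase API. */
    public interface RegionCloser {
        void close(String regionName) throws Exception;
    }

    /**
     * Closes regions in fixed-size batches and sleeps between batches,
     * so that close events reach ZK/master in small bursts instead of
     * all at once. Returns the number of regions closed.
     */
    public static int closeGradually(List<String> regionNames, int batchSize,
                                     long pauseMillis, RegionCloser closer)
            throws Exception {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int closed = 0;
        for (int start = 0; start < regionNames.size(); start += batchSize) {
            int end = Math.min(start + batchSize, regionNames.size());
            for (String region : regionNames.subList(start, end)) {
                closer.close(region);
                closed++;
            }
            // Pause only if another batch is still to come.
            if (end < regionNames.size()) {
                Thread.sleep(pauseMillis);
            }
        }
        return closed;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder region names; a real script would list the table's regions.
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            regions.add("region-" + i);
        }
        int n = closeGradually(regions, 3, 10L,
                r -> System.out.println("closing " + r));
        System.out.println("closed " + n + " regions");
    }
}
```

Whether this actually reduces the load depends on how the master reacts to each close, which I have not verified.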

The pain derives from the current master assignment design, which relies on ZK to track assign/split
progress and status. When creating, dropping, or restarting tables with a very large number of regions,
ZK can be overwhelmed by very heavy create/update/delete operations arriving at almost the
same time.

I wonder whether this is a kind of abuse of ZK: by design, ZK is expected to store a small amount
of meta/config data with sparse access, not such a huge amount of data/nodes (if the region count
reaches 20K-100K) with intensive access.

Why not store the assignment progress/status info in another system table, like the META table,
rather than in ZK?
From: Michael Webster [michael.webster@bronto.com]
Sent: September 10, 2013 7:36
To: user@hbase.apache.org
Subject: Dropping a very large table


I have a very large HBase table running on 0.90, large meaning >20K regions
with a max region size of 1GB. This table is legacy and can be dropped, but
we aren't sure what impact disabling/dropping a table that large will
have on our cluster.

We are using dropAsync and polling HTable#isEnabled instead of the standard
shell disable command to avoid a timeout during disable like in
Is there any risk of overwhelming ZooKeeper or the master with region
closed events during the disable, or would it be comparable to what happens
during a cluster restart when the region servers close out their regions?  Additionally, are
there any concerns with wiping out that much data in HDFS at once during
the drop?
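
For readers unfamiliar with the poll-until-disabled pattern mentioned above, here is a minimal sketch. It is not the code actually in use on this cluster: DisablePoller and waitUntilDisabled are hypothetical names, and the enabled check is passed in as a plain BooleanSupplier standing in for whatever call reports the table state. The interval and timeout values are assumptions.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class DisablePoller {

    /**
     * Polls the supplied "is the table still enabled?" check until it
     * returns false or the timeout elapses. Returns the number of polls made.
     * The check itself is hypothetical and supplied by the caller.
     */
    public static int waitUntilDisabled(BooleanSupplier isEnabled,
                                        long intervalMillis,
                                        long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int polls = 0;
        while (true) {
            polls++;
            if (!isEnabled.getAsBoolean()) {
                return polls; // table reported as disabled
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new TimeoutException(
                        "table still enabled after " + timeoutMillis + " ms");
            }
            Thread.sleep(intervalMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated table state: reports "enabled" for the first three checks.
        int[] calls = {0};
        int polls = waitUntilDisabled(() -> ++calls[0] <= 3, 100L, 5000L);
        System.out.println("disabled after " + polls + " polls");
    }
}
```

Because the caller controls the timeout, a slow disable of many regions does not fail the way a fixed client-side timeout would.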

Thank you in advance,
