hbase-user mailing list archives

From Ganesh Viswanathan <gan...@gmail.com>
Subject Re: Dropping a very large table - 75million rows
Date Fri, 03 Feb 2017 21:02:16 GMT
Thanks Josh.

Also, I realized I didn't give the full size of the table. It takes in
~75 million rows per minute and stores them for 15 days. So around
1.125 billion rows total.

On Fri, Feb 3, 2017 at 12:52 PM, Josh Elser <elserj@apache.org> wrote:

> I think you are worried about nothing, Ganesh.
>
> If you want to drop (delete) the entire table, just disable and drop it
> from the shell. This operation is not going to have a significant impact on
> your cluster (save a few flushes). Those flushes would only happen if you
> have had recent writes to this table (which seems unlikely if you want to
> drop it).
>
>
> Ganesh Viswanathan wrote:
>
>> Hello,
>>
>> I need to drop an old HBase table that is quite large. It has anywhere
>> between 2 million and 70 million datapoints. I stopped the count after it
>> had run in the HBase shell for half a day. I have 4 other tables that have
>> around 75 million rows in total and also take heavy PUT and GET traffic.
>>
>> What is the best practice for disabling and dropping such a large table in
>> HBase so that I have minimal impact on the rest of the cluster?
>> 1) I hear there are ways to disable (and drop?) specific regions? Would
>> that work?
>> 2) Should I scan and delete a few rows at a time until the size becomes
>> manageable and then disable/drop the table?
>>    If so, what is a good number of rows to delete at a time, should I run
>> a major compaction after these row deletes on specific regions, and what
>> is a good sized table that can be easily dropped (and has been validated)
>> without causing issues on the larger cluster.
>>
>>
>> Thanks!
>> Ganesh
>>
>>
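
For reference, the disable-and-drop sequence Josh describes might look like
the following in the HBase shell. This is only a minimal sketch and is not
taken from the thread: the table name 'old_table' is a placeholder, and the
list / is_disabled / exists lines are optional sanity checks added here for
illustration.

```
# 'old_table' is a hypothetical name; substitute the table to be removed.
list                      # confirm the exact table name first
disable 'old_table'       # take the table's regions offline
is_disabled 'old_table'   # optional: check that the disable completed
drop 'old_table'          # remove the table; it must be disabled first
exists 'old_table'        # optional: check that the table is gone
```

The table has to be disabled before it can be dropped, which is why the two
commands are always run in this order.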
