hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: HTable.delete(List) vs delete(Delete) and MR-related Q
Date Wed, 08 Dec 2010 18:54:08 GMT
On Wed, Dec 8, 2010 at 5:35 AM, Alex Baranau <alex.baranov.v@gmail.com> wrote:
> Hello,
>
> Please correct my following assumptions if they are wrong.
>
> HTable.delete(List) works a bit faster than delete(Delete) (if we need to
> delete multiple records) because the former causes single RPC request
> (single from client, but still chunks of deletes are sent to respective
> region servers). And it looks like deleting itself is the same in current
> implementation (as of 0.20.6 at least, what about trunk?): records deleted
> one-by-one.
>

In 0.20, yes.

In TRUNK, batching has been recast but in essence works the same way;
i.e. once the 'Action' gets to the server, we add the edit one at a
time.
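
To make the trade-off concrete, here is a minimal sketch of the two call patterns. It is a toy model, not HBase code: the class, the `Stats` counters and the `serverOf` lookup are hypothetical names made up for illustration, and the region lookup is faked by the first character of the row key. It only counts round trips and per-row edits, assuming the server applies edits one at a time in both cases, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of the two delete paths; none of these names are HBase API.
final class BatchDeleteSketch {

    /** Counts simulated round trips and the edits applied on the server side. */
    static final class Stats {
        int rpcs;
        int editsApplied;
    }

    /** One round trip per row, as with repeated delete(Delete) calls. */
    static Stats deleteOneByOne(List<String> rows) {
        Stats stats = new Stats();
        for (String row : rows) {
            stats.rpcs++;          // each call goes to the server on its own
            stats.editsApplied++;  // the server applies that single edit
        }
        return stats;
    }

    /** Rows are grouped per server; each group travels in one round trip. */
    static Stats deleteBatched(List<String> rows, Function<String, String> serverOf) {
        Map<String, List<String>> byServer = new LinkedHashMap<>();
        for (String row : rows) {
            byServer.computeIfAbsent(serverOf.apply(row), k -> new ArrayList<>()).add(row);
        }
        Stats stats = new Stats();
        for (List<String> chunk : byServer.values()) {
            stats.rpcs++;              // one request per server
            for (String row : chunk) {
                stats.editsApplied++;  // edits are still added one at a time
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a1", "a2", "b1", "b2", "b3");
        // Assumption: the first character of the row key picks the server.
        Function<String, String> serverOf = row -> "rs-" + row.charAt(0);

        Stats single = deleteOneByOne(rows);
        Stats batched = deleteBatched(rows, serverOf);
        System.out.println("one-by-one: rpcs=" + single.rpcs + " edits=" + single.editsApplied);
        System.out.println("batched:    rpcs=" + batched.rpcs + " edits=" + batched.editsApplied);
    }
}
```

In this model the saving comes entirely from fewer round trips; the number of edits applied on the server is the same either way.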


> If it's true, then it might make sense to accept List<Delete> (some
> Writable form of it) in TableOutputFormat (along with currently accepted Put
> and Delete): this could improve performance?
>

Yes.  It could.

Or, redo the TOF to take multiple Actions (in TRUNK):
http://people.apache.org/~stack/hbase-0.90.0-candidate-1/docs/xref/org/apache/hadoop/hbase/client/Action.html

Or fix our write buffer in HTable so it does Actions -- Puts and
Deletes -- rather than just Puts.
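
A rough sketch of what a write buffer holding both kinds of edit could look like follows. It is a guess at the shape, not the HTable implementation: `ActionBufferSketch`, `Kind`, `Action`, the byte-size accounting and the `sender` callback are all hypothetical names for illustration. The one design point it tries to show is that Puts and Deletes stay in arrival order in a single queue, so a Delete issued after a Put on the same row is not reordered ahead of it at flush time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical client-side buffer for mixed edits; not the HTable write buffer.
final class ActionBufferSketch {

    enum Kind { PUT, DELETE }

    /** A buffered edit with an estimated size, used only for the flush threshold. */
    static final class Action {
        final Kind kind;
        final String row;
        final long sizeBytes;

        Action(Kind kind, String row, long sizeBytes) {
            this.kind = kind;
            this.row = row;
            this.sizeBytes = sizeBytes;
        }
    }

    private final long limitBytes;
    private final Consumer<List<Action>> sender;  // stands in for the batched RPC
    private final List<Action> pending = new ArrayList<>();
    private long pendingBytes;

    ActionBufferSketch(long limitBytes, Consumer<List<Action>> sender) {
        this.limitBytes = limitBytes;
        this.sender = sender;
    }

    /** Queues an edit and flushes once the estimated size exceeds the limit. */
    void add(Action action) {
        pending.add(action);
        pendingBytes += action.sizeBytes;
        if (pendingBytes > limitBytes) {
            flush();
        }
    }

    /** Sends everything queued so far as one batch, in arrival order. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(pending));  // copy so the batch survives clear()
        pending.clear();
        pendingBytes = 0;
    }

    public static void main(String[] args) {
        ActionBufferSketch buffer = new ActionBufferSketch(100,
                batch -> System.out.println("flushed " + batch.size() + " actions"));
        buffer.add(new Action(Kind.PUT, "row1", 40));
        buffer.add(new Action(Kind.DELETE, "row1", 40));
        buffer.add(new Action(Kind.PUT, "row2", 40));  // crosses the limit, triggers a flush
        buffer.add(new Action(Kind.DELETE, "row3", 10));
        buffer.flush();                                // explicit flush of the remainder
    }
}
```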


> Btw, it looks like with Puts we don't have this problem in case client-side
> write buffer is used: put(Put) and put(List) are equivalent.

That's what I'd think... haven't tested it.

> Well, almost:
> the first one creates extra ArrayList instance internally. Btw-2: it seems
> like this instance is created for the sake of a bit better code
> design/style, but doesn't look like it's worth it IMHO.
>

Smile.  Thanks for looking into this Alex.
St.Ack
