hbase-user mailing list archives

From Michael Dalton <mwdal...@gmail.com>
Subject HTable checkAndPut equivalent for Deletes
Date Fri, 30 Apr 2010 21:51:14 GMT
Hi everyone,

I have a quick question -- I'd like to do a simple atomic check-and-Delete
for a row. For Put operations, HTable.checkAndPut appears to allow a simple
atomic compare-and-update, which is great. However, there doesn't seem to be
an equivalent function for deletes.

I was thinking about approximating this by writing NULL or a zero-length byte
array as a value in a Put to emulate deleting a cell. It appears that
checkAndPut already treats a zero-length array as equivalent to a
non-existent value when performing its comparison (before committing the
Put). The only drawback I can see to this is that I never truly remove rows,
I just end up with 'dead' rows containing empty byte arrays, so I'd imagine
that every N hours or days I would need to garbage collect these empty rows
somehow (which brings us back full circle to the issue of how to atomically
check and delete a row).
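
To make the empty-value idea concrete, here is a minimal in-memory sketch in Java. It is only a model of the semantics described above, not the HBase client API: the class TombstoneCells and its methods are hypothetical names, a ConcurrentHashMap stands in for a single column, and the rule that a missing cell equals a zero-length value is taken from the observation above rather than verified against the regionserver code.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for one column of a table (not HBase itself).
final class TombstoneCells {
    private static final byte[] EMPTY = new byte[0];
    private final ConcurrentHashMap<String, byte[]> cells = new ConcurrentHashMap<>();

    // Assumption from the message: a missing cell and a zero-length value compare as equal.
    private static boolean matches(byte[] current, byte[] expected) {
        byte[] c = (current == null) ? EMPTY : current;
        byte[] e = (expected == null) ? EMPTY : expected;
        return Arrays.equals(c, e);
    }

    // Atomically store newValue only if the current value matches expected.
    boolean checkAndPut(String row, byte[] expected, byte[] newValue) {
        boolean[] applied = {false};
        cells.compute(row, (key, current) -> {
            if (!matches(current, expected)) {
                return current;              // leave the cell unchanged
            }
            applied[0] = true;
            return (newValue == null) ? EMPTY : newValue.clone();
        });
        return applied[0];
    }

    // Emulated delete: overwrite the cell with an empty array if the check passes.
    boolean checkAndTombstone(String row, byte[] expected) {
        return checkAndPut(row, expected, EMPTY);
    }

    // A row counts as live only if it holds a non-empty value.
    boolean isLive(String row) {
        byte[] v = cells.get(row);
        return v != null && v.length > 0;
    }

    // Number of rows physically stored, including 'dead' rows with empty values.
    int storedRows() {
        return cells.size();
    }
}
```

After a successful checkAndTombstone the row is no longer live, but storedRows() still counts it, which is exactly the garbage collection problem mentioned above.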

The only real alternative I can see for doing this would be to emulate
checkAndDelete by using RowLocks to lock the row, perform a Get, verify that
the row contains the expected value, then perform a delete, and then unlock
the row itself. Correct me if I'm wrong, but this should definitely emulate
the semantics of atomic compare-and-Delete (assuming the compare and delete
operate on the same row and use the RowLock). However, I'm not sure what the
performance would be for using RowLocks to emulate checkAndDelete on the
client side vs. using Put+checkAndPut to emulate checkAndDelete on the
server side. Does anyone have any advice on this issue, or any idea what the
relative tradeoffs are?
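
For comparison, the lock-based variant can be sketched the same way. Again this is an in-memory model rather than HBase code: LockedRowStore is a hypothetical class, and a per-row ReentrantLock stands in for a RowLock. The steps are the ones listed above: lock, Get, verify, delete, unlock.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory model of the lock / get / verify / delete / unlock sequence.
final class LockedRowStore {
    private final Map<String, byte[]> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // One lock per row key, created on first use (stand-in for a RowLock).
    private ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, key -> new ReentrantLock());
    }

    // Writers must take the same row lock, otherwise the check below is not atomic.
    void put(String row, byte[] value) {
        ReentrantLock lock = lockFor(row);
        lock.lock();
        try {
            rows.put(row, value.clone());
        } finally {
            lock.unlock();
        }
    }

    // Returns a copy of the stored value, or null if the row does not exist.
    byte[] get(String row) {
        byte[] v = rows.get(row);
        return (v == null) ? null : v.clone();
    }

    // Delete the row only if it currently holds the expected value.
    boolean checkAndDelete(String row, byte[] expected) {
        ReentrantLock lock = lockFor(row);
        lock.lock();                              // 1. lock the row
        try {
            byte[] current = rows.get(row);       // 2. read the current value
            if (current == null || !Arrays.equals(current, expected)) {
                return false;                     // 3. verification failed, nothing deleted
            }
            rows.remove(row);                     // 4. delete the row
            return true;
        } finally {
            lock.unlock();                        // 5. always release the lock
        }
    }
}
```

In this model the steps are local calls. Against a real cluster each step would presumably be a separate client-to-server round trip, which is the performance concern raised above.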

In the long run, it seems to me that the clearly optimal solution would be
to have a checkAndDelete function in HTable, and I'd be interested in
adding this functionality if no one else is currently working on it. Is that
something that would be interesting to integrate and worth committing back
to mainline? Are there any hidden pitfalls I should be aware of, or some
technical/design reason for why this API call doesn't already exist? If not,
I'll take a hard look at the delete and checkAndPut code in the regionserver
and sometime soon open an issue in JIRA and start coding.

Best regards,

