hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: removing cells in minor compaction
Date Fri, 23 Jun 2017 17:37:07 GMT
On Wed, Jun 14, 2017 at 5:51 PM, Dave Latham <latham@davelink.net> wrote:

> What cells, if any, are removed during minor compactions?
>
> Cells that
> (a) are beyond the TTL?
> (b) are shadowed by a delete marker? (from the files compacted)
> (c) are shadowed by newer versions? (assuming numVersions configured < num
> versions of the cell found)
>


When compacting, we use scanners to read the hfiles. The core difference
between major and minor compaction is the scanType: if major (i.e. all files
in the Store are in the compaction set), then ScanType.COMPACT_DROP_DELETES,
else ScanType.COMPACT_RETAIN_DELETES.
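
The scanType choice described above can be sketched roughly as follows. This is an illustration only, not the HBase source: the class ScanTypeSketch, the method choose and the local ScanType enum are hypothetical stand-ins, and store files are represented by plain strings.

```java
import java.util.Arrays;
import java.util.List;

class ScanTypeSketch {
    // Local stand-in for the two scan types named in the message; not the HBase enum.
    enum ScanType { COMPACT_DROP_DELETES, COMPACT_RETAIN_DELETES }

    // Hypothetical helper: a compaction that covers every file in the Store is
    // treated as major and may drop delete markers; anything less keeps them.
    static ScanType choose(List<String> storeFiles, List<String> compactionSet) {
        boolean allFiles = compactionSet.size() == storeFiles.size()
                && compactionSet.containsAll(storeFiles);
        return allFiles ? ScanType.COMPACT_DROP_DELETES : ScanType.COMPACT_RETAIN_DELETES;
    }

    public static void main(String[] args) {
        List<String> store = Arrays.asList("f1", "f2", "f3");
        System.out.println(choose(store, Arrays.asList("f1", "f2", "f3")));
        System.out.println(choose(store, Arrays.asList("f1", "f2")));
    }
}
```

The point of the split is that a delete marker can only be thrown away safely when every file that might hold a cell it shadows is part of the same compaction.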

The logic on what to retain/delete is the same as for a Scan, determined by
the rules in ScanQueryMatcher (actually, compactions use
CompactionScanQueryMatcher, a subclass specialized for compactions).

To answer your questions Dave:

a.) Yes (A Scan does not let you see Cells that are beyond TTL so on
compaction, they are not 'seen' and so not written out to the new compacted
file).
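b.) Yes, when the delete marker is in the files being compacted; the marker
itself is retained in a minor compaction (See logic in
CompactionScanQueryMatcher)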
c.) Yes
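
For (a) and (c), a toy model of the filtering may help. It is a sketch under stated assumptions, not the ScanQueryMatcher code: the class MinorCompactionSketch, its Cell type and the compact method are hypothetical, cells carry only a column name and a millisecond timestamp, and the input is assumed to be sorted by column and then newest first, the way a scanner would hand cells over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MinorCompactionSketch {
    // Hypothetical stand-in for a cell: a column name plus a timestamp in ms.
    static final class Cell {
        final String column;
        final long timestamp;
        Cell(String column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    // Input is assumed sorted by column, then newest timestamp first.
    static List<Cell> compact(List<Cell> sorted, long now, long ttlMs, int maxVersions) {
        List<Cell> out = new ArrayList<>();
        String currentColumn = null;
        int versionsKept = 0;
        for (Cell c : sorted) {
            if (!c.column.equals(currentColumn)) {
                currentColumn = c.column;   // new column: restart the version count
                versionsKept = 0;
            }
            if (now - c.timestamp > ttlMs) {
                continue;                   // (a) beyond TTL: never seen, so never written
            }
            if (versionsKept >= maxVersions) {
                continue;                   // (c) shadowed by enough newer versions
            }
            out.add(c);
            versionsKept++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> input = Arrays.asList(
                new Cell("a", 900), new Cell("a", 800), new Cell("a", 700),
                new Cell("a", 100), new Cell("b", 600));
        for (Cell c : compact(input, 1000L, 500L, 2)) {
            System.out.println(c.column + "@" + c.timestamp);
        }
    }
}
```

Only versions found in the files being compacted are counted, so a minor compaction can leave more than the configured number of versions spread across the Store as a whole.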

St.Ack
