hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Major compaction cannot remove deleted rows until the region is split. Strange!
Date Thu, 09 Jun 2016 19:08:23 GMT
Tian-ying:

Can you try the suggestion below to see if it helps?

Do you see the lease expired logs?
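
(Something like this against the RegionServer log should turn them up; the
log file name below is illustrative:)

  grep "lease expired" hbase-regionserver.log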

St.Ack


On Thu, Jun 2, 2016 at 7:03 PM, hao zhang <zghaobac@gmail.com> wrote:

> Hi, Stack
>
>   We found this in our production cluster, too. I took a look at the
> code and found one case that makes major compaction stop working:
>
>   1. put some rows
>   2. scan
>   3. delete
>   4. the scanner is not closed properly, so its read point stays in the
> region's scannerReadPoints.
>   5. major compact. It can't work, because there is still a scanner with a
> smaller read point than the delete.
>
>   But if you move the region to a new RS, the scannerReadPoints are rebuilt
> from the biggest memstoreTs of the storefiles, so major compaction will work.
>   I think the try...catch block in the Scanner.leaseExpired() method needs a
> finally block to close the region scanner properly (a sketch of the fix
> follows the current code below).
>
>     public void leaseExpired() {
>       RegionScannerHolder rsh = scanners.remove(this.scannerName);
>       if (rsh != null) {
>         RegionScanner s = rsh.s;
>         LOG.info("Scanner " + this.scannerName + " lease expired on region "
>           + s.getRegionInfo().getRegionNameAsString());
>         try {
>           Region region = regionServer.getRegion(s.getRegionInfo().getRegionName());
>           if (region != null && region.getCoprocessorHost() != null) {
>             region.getCoprocessorHost().preScannerClose(s);
>           }
>           s.close();
>           if (region != null && region.getCoprocessorHost() != null) {
>             region.getCoprocessorHost().postScannerClose(s);
>           }
>         } catch (IOException e) {
>           LOG.error("Closing scanner for "
>             + s.getRegionInfo().getRegionNameAsString(), e);
>         }
>       } else {
>         LOG.warn("Scanner " + this.scannerName + " lease expired, but no related" +
>           " scanner found, hence no chance to close that related scanner!");
>       }
>     }
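>
>   A minimal sketch of what I mean (untested; names as in the 0.94 code
> above) -- close() moves into a finally block so the read point is released
> even when getRegion() or a coprocessor hook throws:
>
>     public void leaseExpired() {
>       RegionScannerHolder rsh = scanners.remove(this.scannerName);
>       if (rsh == null) {
>         LOG.warn("Scanner " + this.scannerName + " lease expired, but no related" +
>           " scanner found, hence no chance to close that related scanner!");
>         return;
>       }
>       RegionScanner s = rsh.s;
>       LOG.info("Scanner " + this.scannerName + " lease expired on region "
>         + s.getRegionInfo().getRegionNameAsString());
>       Region region = null;
>       try {
>         region = regionServer.getRegion(s.getRegionInfo().getRegionName());
>         if (region != null && region.getCoprocessorHost() != null) {
>           region.getCoprocessorHost().preScannerClose(s);
>         }
>       } catch (IOException e) {
>         LOG.error("Closing scanner for "
>           + s.getRegionInfo().getRegionNameAsString(), e);
>       } finally {
>         try {
>           // Always close, so the scanner's entry is removed from the
>           // region's scannerReadPoints even if the hooks above failed.
>           s.close();
>           if (region != null && region.getCoprocessorHost() != null) {
>             region.getCoprocessorHost().postScannerClose(s);
>           }
>         } catch (IOException e) {
>           LOG.error("Closing scanner for "
>             + s.getRegionInfo().getRegionNameAsString(), e);
>         }
>       }
>     }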
>
>
> 2016-06-02 2:50 GMT+08:00 Stack <stack@duboce.net>:
>
> > On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <tychang@gmail.com> wrote:
> >
> > > Hi, Stack
> > >
> > > After moving the region and issuing a major compact on that region, its
> > > size shrank from 99G down to 24G. So it looks like the region was in a
> > > bad state it could not recover from, and close/open fixed the issue. From
> > > the region size metric graph, we can see major compaction stopped working
> > > on March 31, so some bug caused the region to enter the bad state...
> > > Unfortunately, we don't have DEBUG enabled, and that was the last region
> > > with the issue, so it is hard to figure out which bug caused the bad
> > > state...
> > >
> > >
> > Interesting. So moving it to another RS makes it major-compactable? That
> > would seem to indicate some state kept in RS memory is preventing the
> > major compaction from running. Is moving the region a workaround for you
> > until we figure out what it is, Tian-Ying?
> >
> > St.
> >
> >
> >
> > > Thanks
> > > Tian-Ying
> > >
> > > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <tychang@gmail.com> wrote:
> > >
> > > > Hi, Stack
> > > >
> > > > Based on the log, the major compaction did run, and it took 5+ hours.
> > > > I also manually ran major_compact from the hbase shell explicitly to
> > > > verify.
> > > >
> > > > I just moved the region to a different RS and issued a major_compact on
> > > > that region again; let me see if the major compaction can succeed, and I
> > > > will report back.
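> > > >
> > > > (For reference, the shell commands are of this shape -- the region and
> > > > server names here are placeholders:)
> > > >
> > > >   hbase> move 'ENCODED_REGION_NAME', 'HOST,PORT,STARTCODE'
> > > >   hbase> major_compact 'REGION_NAME'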
> > > >
> > > > Thanks
> > > > Tian-Ying
> > > >
> > > > On Sun, May 29, 2016 at 4:35 PM, Stack <stack@duboce.net> wrote:
> > > >
> > > >> On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <tychang@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Yes, it is 94.26. From a quick glance, I didn't see any put that is
> > > >> > older than the delete marker's TS, which could go back as far as a
> > > >> > couple of weeks, since major compaction seems not to have worked on
> > > >> > it for a long time.
> > > >> >
> > > >> > Also, it is really strange that if the region is split, then
> > > >> > everything seems to work as expected. Also we noticed the same region
> > > >> > replicated on the slave side is totally normal, i.e. at 20+G....
> > > >> >
> > > >> >
> > > >> If you move the region to another server, does that work?
> > > >>
> > > >> Looking in the 0.94 codebase, I see this in Compactor#compact:
> > > >>
> > > >>
> > > >>       // For major compactions calculate the earliest put timestamp
> > > >>       // of all involved storefiles. This is used to remove
> > > >>       // family delete marker during the compaction.
> > > >>       if (majorCompaction) {
> > > >>         tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
> > > >>         if (tmp == null) {
> > > >>           // There's a file with no information, must be an old one
> > > >>           // assume we have very old puts
> > > >>           earliestPutTs = HConstants.OLDEST_TIMESTAMP;
> > > >>         } else {
> > > >>           earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
> > > >>         }
> > > >>       }
> > > >>
> > > >>
> > > >> The above is followed by this log line:
> > > >>
> > > >>
> > > >>       if (LOG.isDebugEnabled()) {
> > > >>         LOG.debug("Compacting " + file +
> > > >>           ", keycount=" + keyCount +
> > > >>           ", bloomtype=" + r.getBloomFilterType().toString() +
> > > >>           ", size=" + StringUtils.humanReadableInt(r.length()) +
> > > >>           ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
> > > >>           (majorCompaction ? ", earliestPutTs=" + earliestPutTs : ""));
> > > >>       }
> > > >>
> > > >> This prints out earliestPutTs. Do you see that in the logs? Are you
> > > >> running with DEBUG? Does the earliest put ts preclude our dropping the
> > > >> delete family?
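> > > >>
> > > >> (To get that line you'd need DEBUG on the regionserver, e.g. a
> > > >> log4j.properties entry along these lines -- the logger name assumes
> > > >> the 0.94 package layout:)
> > > >>
> > > >>   log4j.logger.org.apache.hadoop.hbase.regionserver=DEBUG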
> > > >>
> > > >>
> > > >> Looking more at the code, we retain deletes in the following
> > > >> circumstances:
> > > >>
> > > >>
> > > >>     this.retainDeletesInOutput = scanType == ScanType.MINOR_COMPACT || scan.isRaw();
> > > >>
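> > > >> (Aside: a raw scan is one way to confirm from a client that the delete
> > > >> markers are still there -- table and row names below are placeholders:)
> > > >>
> > > >>     // 0.94-style client fragment; raw scans return delete markers too
> > > >>     HTable table = new HTable(conf, "my_table");
> > > >>     Scan scan = new Scan(Bytes.toBytes("my_row"));
> > > >>     scan.setRaw(true);       // include delete markers in the results
> > > >>     scan.setMaxVersions();   // and all versions
> > > >>     ResultScanner results = table.getScanner(scan);
> > > >>     for (Result result : results) {
> > > >>       for (KeyValue kv : result.raw()) {
> > > >>         System.out.println(kv);  // KeyValue prints its type, e.g. DeleteFamily
> > > >>       }
> > > >>     }
> > > >>     results.close();
> > > >>     table.close();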
> > > >>
> > > >> So, for sure we are running major compaction?
> > > >>
> > > >> Otherwise, I'll have to dig in a bit more here... This stuff is a
> > > >> little involved.
> > > >> St.Ack
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> > On Fri, May 27, 2016 at 3:13 PM, Stack <stack@duboce.net> wrote:
> > > >> >
> > > >> > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <tychang@gmail.com> wrote:
> > > >> > >
> > > >> > > > Hi,
> > > >> > > >
> > > >> > > > We saw a very strange case in one of our production clusters. A
> > > >> > > > couple of regions cannot get their deleted rows or delete markers
> > > >> > > > removed even after major compaction. However, when the region
> > > >> > > > triggered a split (we set 100G for auto split), the deletion
> > > >> > > > worked: the 100G region became two 10G daughter regions, and all
> > > >> > > > the delete markers are gone.
> > > >> > > >
> > > >> > > > Also, the same region in the slave cluster (through replication)
> > > >> > > > has a normal size at about 20+G.
> > > >> > > >
> > > >> > > > BTW, the delete markers in the regions are mostly DeleteFamily,
> > > >> > > > if it matters.
> > > >> > > >
> > > >> > > > This is really weird. Anyone has any clue about this strange
> > > >> > > > behavior?
> > > >> > > >
> > > >> > > > Thanks
> > > >> > > > Tian-Ying
> > > >> > >
> > > >> > > These 0.94, Tian-Ying?
> > > >> > >
> > > >> > > It looks like only the DeleteFamily is retained; do you see
> > > >> > > instances where there may have been versions older than the
> > > >> > > DeleteFamily that are also retained post-major-compaction?
> > > >> > >
> > > >> > > St.Ack
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > > A snippet of the HFile generated by the major compaction:
> > > >> > > >
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
> > > >> > > > V:
> > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
> > > >> > > > V:
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
