hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Major compaction cannot remove deleted rows until the region is split. Strange!
Date Mon, 13 Jun 2016 17:35:51 GMT
Thanks for filing HBASE-16012, Guanghao Zhang...
St.Ack

On Thu, Jun 2, 2016 at 10:06 PM, Anoop John <anoop.hbase@gmail.com> wrote:

> So you feel the call to s.close() could have thrown an exception, so
> that the removal of the entry from scannerReadPoints never happened.
> hmm.. sounds possible
>
>     public synchronized void close() {
>       if (storeHeap != null) {
>         storeHeap.close();
>         storeHeap = null;
>       }
>       if (joinedHeap != null) {
>         joinedHeap.close();
>         joinedHeap = null;
>       }
>       // no need to synchronize here.
>       scannerReadPoints.remove(this);
>       this.filterClosed = true;
>     }
>
> We may need to change this close impl and make sure the removal from
> scannerReadPoints happens in a finally block.
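>
> A minimal sketch of what that could look like (hypothetical, not a
> tested patch; assumes the same fields as the method above):
>
>     public synchronized void close() {
>       try {
>         if (storeHeap != null) {
>           storeHeap.close();
>           storeHeap = null;
>         }
>         if (joinedHeap != null) {
>           joinedHeap.close();
>           joinedHeap = null;
>         }
>       } finally {
>         // Always drop our entry, even if closing a heap throws.
>         scannerReadPoints.remove(this);
>         this.filterClosed = true;
>       }
>     }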
>
> -Anoop-
>
>
> On Fri, Jun 3, 2016 at 7:33 AM, hao zhang <zghaobac@gmail.com> wrote:
> > Hi, Stack
> >
> >   We found this in our production cluster, too. I took a look at the
> > code and found one case which will make major compaction not work.
> >
> >   1. put some rows
> >   2. scan
> >   3. delete
> >   4. the scanner didn't close properly, so it keeps its read point in
> > the region's scannerReadPoints.
> >   5. major compact. It can't work because we have a scanner with a
> > smaller read point than the delete. (A repro sketch follows below.)
> >
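> >   A repro sketch of the above (hypothetical, not from the ticket;
> > 0.94-era client API with made-up table/family names "t"/"f"; step 4 is
> > simulated by simply never closing the scanner):
> >
> >     // uses org.apache.hadoop.hbase.* and org.apache.hadoop.hbase.client.*
> >     Configuration conf = HBaseConfiguration.create();
> >     HTable table = new HTable(conf, "t");
> >     Put p = new Put(Bytes.toBytes("row"));
> >     p.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
> >     table.put(p);                                         // 1. put
> >     ResultScanner scanner = table.getScanner(new Scan()); // 2. scan
> >     scanner.next();
> >     Delete d = new Delete(Bytes.toBytes("row"));
> >     d.deleteFamily(Bytes.toBytes("f"));
> >     table.delete(d);                                      // 3. delete
> >     // 4. scanner.close() is never called, so its read point stays
> >     //    registered in the region's scannerReadPoints
> >     new HBaseAdmin(conf).majorCompact("t");               // 5. markers stay
> >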
> >   But if we move the region to a new RS, the scannerReadPoints will be
> > updated to the biggest memstoreTs from the storefiles, so major
> > compaction will work.
> >   I thought the try...catch block in the Scanner.leaseExpired() method
> > needs a finally block to close the region scanner properly.
> >
> >     public void leaseExpired() {
> >       RegionScannerHolder rsh = scanners.remove(this.scannerName);
> >       if (rsh != null) {
> >         RegionScanner s = rsh.s;
> >         LOG.info("Scanner " + this.scannerName + " lease expired on region "
> >           + s.getRegionInfo().getRegionNameAsString());
> >         try {
> >           Region region =
> >             regionServer.getRegion(s.getRegionInfo().getRegionName());
> >           if (region != null && region.getCoprocessorHost() != null) {
> >             region.getCoprocessorHost().preScannerClose(s);
> >           }
> >           s.close();
> >           if (region != null && region.getCoprocessorHost() != null) {
> >             region.getCoprocessorHost().postScannerClose(s);
> >           }
> >         } catch (IOException e) {
> >           LOG.error("Closing scanner for "
> >             + s.getRegionInfo().getRegionNameAsString(), e);
> >         }
> >       } else {
> >         LOG.warn("Scanner " + this.scannerName + " lease expired, but no related" +
> >           " scanner found, hence no chance to close that related scanner!");
> >       }
> >     }
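> >
> >   A sketch of that suggestion (hypothetical, not a tested patch;
> > postScannerClose handling omitted for brevity): guarantee s.close()
> > runs even if the pre-close coprocessor hook throws.
> >
> >         try {
> >           Region region =
> >             regionServer.getRegion(s.getRegionInfo().getRegionName());
> >           if (region != null && region.getCoprocessorHost() != null) {
> >             region.getCoprocessorHost().preScannerClose(s);
> >           }
> >         } catch (IOException e) {
> >           LOG.error("Pre-close hook for "
> >             + s.getRegionInfo().getRegionNameAsString(), e);
> >         } finally {
> >           try {
> >             s.close();  // always runs, even if the hook threw
> >           } catch (IOException e) {
> >             LOG.error("Closing scanner for "
> >               + s.getRegionInfo().getRegionNameAsString(), e);
> >           }
> >         }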
> >
> >
> > 2016-06-02 2:50 GMT+08:00 Stack <stack@duboce.net>:
> >
> >> On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <tychang@gmail.com>
> >> wrote:
> >>
> >> > Hi, Stack
> >> >
> >> > After moving the region and issuing a major compact on that region,
> >> > its size shrank from 99G down to 24G. So it looks like the region was
> >> > in a bad state that it could not recover from, and closing/opening it
> >> > fixed the issue. From the region size metric graph, we can see major
> >> > compaction stopped working since March 31, so some bug caused the
> >> > region to enter the bad state... Unfortunately, we don't have DEBUG
> >> > enabled, and that was the last region that had the issue, so it is
> >> > hard to figure out what bug caused the bad state...
> >> >
> >> >
> >> Interesting. So moving it to another RS makes it major-compactable?
> >> That would seem to indicate some state kept in RS memory is preventing
> >> the major compaction from running. Is moving the region a workaround
> >> for you until we figure out what it is, Tian-Ying?
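> >>
> >> For reference, a sketch of that workaround through the client API
> >> (hypothetical; assumes 0.94's HBaseAdmin and made-up variable names; a
> >> null destination lets the master pick a server):
> >>
> >>     HBaseAdmin admin = new HBaseAdmin(conf);
> >>     admin.move(Bytes.toBytes(encodedRegionName), null); // relocate region
> >>     admin.majorCompact(regionName);                     // then retry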
> >>
> >> St.
> >>
> >>
> >>
> >> > Thanks
> >> > Tian-Ying
> >> >
> >> > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <tychang@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi, Stack
> >> > >
> >> > > Based on the log, the major compaction was run, and it took 5+
> >> > > hours. And I also manually ran major_compact from the hbase shell
> >> > > explicitly to verify.
> >> > >
> >> > > I just moved the region to a different RS and issued a major_compact
> >> > > on that region again; let me see if the major compaction can succeed,
> >> > > and I will report back.
> >> > >
> >> > > Thanks
> >> > > Tian-Ying
> >> > >
> >> > > On Sun, May 29, 2016 at 4:35 PM, Stack <stack@duboce.net> wrote:
> >> > >
> >> > >> On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <tychang@gmail.com>
> >> > >> wrote:
> >> > >>
> >> > >> > Yes, it is 94.26. From a quick glance, I didn't see any put that
> >> > >> > is older than the delete marker's TS, which could go back as far
> >> > >> > as a couple of weeks, since major compaction hasn't worked on it
> >> > >> > for a long time.
> >> > >> >
> >> > >> > Also it is really strange that if the region is split, then
> >> > >> > everything seems to work as expected. And we noticed the same
> >> > >> > region replicated on the slave side is totally normal, i.e. at
> >> > >> > 20+G....
> >> > >> >
> >> > >> >
> >> > >> If you move the region to another server, does that work?
> >> > >>
> >> > >> Looking in the 0.94 codebase, I see this in Compactor#compact:
> >> > >>
> >> > >>
> >> > >>       // For major compactions calculate the earliest put timestamp
> >> > >>       // of all involved storefiles. This is used to remove
> >> > >>       // family delete marker during the compaction.
> >> > >>       if (majorCompaction) {
> >> > >>         tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
> >> > >>         if (tmp == null) {
> >> > >>           // There's a file with no information, must be an old one
> >> > >>           // assume we have very old puts
> >> > >>           earliestPutTs = HConstants.OLDEST_TIMESTAMP;
> >> > >>         } else {
> >> > >>           earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
> >> > >>         }
> >> > >>       }
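> >> > >>
> >> > >> If I read that right, a toy illustration (made-up values):
> >> > >>
> >> > >>       // file A: EARLIEST_PUT_TS = 1459000000000
> >> > >>       // file B: EARLIEST_PUT_TS = 1458000000000
> >> > >>       //   => earliestPutTs = 1458000000000, and DeleteFamily
> >> > >>       //      markers with ts < 1458000000000 can be dropped.
> >> > >>       // file C: no EARLIEST_PUT_TS metadata (an old file)
> >> > >>       //   => earliestPutTs = HConstants.OLDEST_TIMESTAMP, so no
> >> > >>       //      marker is older than every put and all are retained.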
> >> > >>
> >> > >>
> >> > >> That block is followed by this log line:
> >> > >>
> >> > >>
> >> > >>       if (LOG.isDebugEnabled()) {
> >> > >>         LOG.debug("Compacting " + file +
> >> > >>           ", keycount=" + keyCount +
> >> > >>           ", bloomtype=" + r.getBloomFilterType().toString() +
> >> > >>           ", size=" + StringUtils.humanReadableInt(r.length()) +
> >> > >>           ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
> >> > >>           (majorCompaction? ", earliestPutTs=" + earliestPutTs: ""));
> >> > >>       }
> >> > >>
> >> > >> This prints out earliestPutTs. You see that in the logs? You
> >> > >> running with DEBUG? Does the earliest put ts preclude our dropping
> >> > >> delete family?
> >> > >>
> >> > >>
> >> > >> Looking more at the code, we retain deletes in the following
> >> > >> circumstances:
> >> > >>
> >> > >>     this.retainDeletesInOutput =
> >> > >>         scanType == ScanType.MINOR_COMPACT || scan.isRaw();
> >> > >>
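> >> > >> Raw scans retain deletes too, so a raw scan is one way to confirm
> >> > >> the DeleteFamily cells are still being served. A sketch (assuming a
> >> > >> 0.94 client and an HTable named table):
> >> > >>
> >> > >>     Scan scan = new Scan();
> >> > >>     scan.setRaw(true);        // return delete markers as well
> >> > >>     scan.setMaxVersions();    // and all versions
> >> > >>     ResultScanner rs = table.getScanner(scan);
> >> > >>     for (Result r : rs) {
> >> > >>       for (KeyValue kv : r.raw()) {
> >> > >>         System.out.println(kv); // DeleteFamily cells show up here
> >> > >>       }
> >> > >>     }
> >> > >>     rs.close();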
> >> > >>
> >> > >> So, for sure we are running major compaction?
> >> > >>
> >> > >> Otherwise, have to dig in a bit more here.. This stuff is a little
> >> > >> involved.
> >> > >> St.Ack
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >> > On Fri, May 27, 2016 at 3:13 PM, Stack <stack@duboce.net> wrote:
> >> > >> >
> >> > >> > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <tychang@gmail.com>
> >> > >> > > wrote:
> >> > >> > >
> >> > >> > > > Hi,
> >> > >> > > >
> >> > >> > > > We saw a very strange case in one of our production
> >> > >> > > > clusters. A couple of regions cannot get their deleted rows
> >> > >> > > > or delete markers removed even after major compaction.
> >> > >> > > > However, when the region triggered a split (we set 100G for
> >> > >> > > > auto split), the deletion worked. The 100G region became two
> >> > >> > > > 10G daughter regions, and all the delete markers were gone.
> >> > >> > > >
> >> > >> > > > Also, the same region in the slave cluster (through
> >> > >> > > > replication) has a normal size, at about 20+G.
> >> > >> > > >
> >> > >> > > > BTW, the delete markers in the regions are mostly
> >> > >> > > > DeleteFamily, if it matters.
> >> > >> > > >
> >> > >> > > > This is really weird. Anyone have any clue about this
> >> > >> > > > strange behavior?
> >> > >> > > >
> >> > >> > > > Thanks
> >> > >> > > > Tian-Ying
> >> > >> > > >
> >> > >> > > These are 0.94, Tian-Ying?
> >> > >> > >
> >> > >> > > It looks like only the DeleteFamily is retained; do you see
> >> > >> > > instances where there may have been versions older than the
> >> > >> > > DeleteFamily that are also retained post-major-compaction?
> >> > >> > >
> >> > >> > > St.Ack
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > > A snippet of the HFile generated by the major compaction:
> >> > >> > > >
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
> >> > >> > > > V:
> >> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
> >> > >> > > > V:
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> >
> >>
>
