hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anoop John <anoop.hb...@gmail.com>
Subject Re: HBase Scan consumes high cpu
Date Fri, 13 Sep 2019 04:53:45 GMT
Hi
            When you did a put with a lower qualifier int  (put 'mytable',
'MY_ROW', "pcf:\x0A", "\x00") the system flow is getting a valid cell at
1st step itself and that getting passed to the Filter.  The Filter is doing
a seek which just avoids all the in between deletes and puts processing..
In 1st case the Filter wont get into action at all unless the scan flow
sees a valid cell.  The delete processing happens as 1st step before the
filter processinf step happening.

In this case I am wondering why you can not add the specific 1st qualifier
in the get part itself along with the column range filter.   I mean

get 'mytable', 'MY_ROW', {COLUMN=>['pcf: *1499000 * '],
FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
true, Bytes.toBytes(1499010.to_java(:int)), false)}

Pardon the syntax it might not be proper for the shell..  Can this be done?
This will make the scan to make a seek to the given qualifier at 1st step
itself.

Anoop

On Thu, Sep 12, 2019 at 10:18 PM Udai Bhan Kashyap (BLOOMBERG/ PRINCETON) <
ukashyap1@bloomberg.net> wrote:

> Are you keeping the deleted cells? Check 'VERSIONS' for the column family
> and set it to 1 if you don't want to keep the deleted cells.
>
> From: user@hbase.apache.org At: 09/12/19 12:40:01To:
> user@hbase.apache.org
> Subject: Re: HBase Scan consumes high cpu
>
> Hi,
>
> As said earlier, we have populated the rowkey "MY_ROW" with integers
> from 0 to 1500000 as column qualifiers. Then we have deleted the
> qualifiers from 0 to 1499000.
>
> We executed the following query. It took 15.3750 seconds to execute.
>
> hbase(main):057:0> get 'mytable', 'MY_ROW', {COLUMN=>['pcf'],
> FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> true, Bytes.toBytes(1499010.to_java(:int)), false)}
> COLUMN                            CELL
>   pcf:\x00\x16\xDFx                timestamp=1568123881899,
> value=\x00\x16\xDFx
>   pcf:\x00\x16\xDFy                timestamp=1568123881899,
> value=\x00\x16\xDFy
>   pcf:\x00\x16\xDFz                timestamp=1568123881899,
> value=\x00\x16\xDFz
>   pcf:\x00\x16\xDF{                timestamp=1568123881899,
> value=\x00\x16\xDF{
>   pcf:\x00\x16\xDF|                timestamp=1568123881899,
> value=\x00\x16\xDF|
>   pcf:\x00\x16\xDF}                timestamp=1568123881899,
> value=\x00\x16\xDF}
>   pcf:\x00\x16\xDF~                timestamp=1568123881899,
> value=\x00\x16\xDF~
>   pcf:\x00\x16\xDF\x7F             timestamp=1568123881899,
> value=\x00\x16\xDF\x7F
>   pcf:\x00\x16\xDF\x80             timestamp=1568123881899,
> value=\x00\x16\xDF\x80
>   pcf:\x00\x16\xDF\x81             timestamp=1568123881899,
> value=\x00\x16\xDF\x81
> 1 row(s) in 15.3750 seconds
>
>
> Now we inserted a new column with qualifier 10 (\x0A), such that it
> comes earlier in lexicographical order. Now we executed the same query.
> It only took 0.0240 seconds.
>
> hbase(main):058:0> put 'mytable', 'MY_ROW', "pcf:\x0A", "\x00"
> 0 row(s) in 0.0150 seconds
> hbase(main):059:0> get 'mytable', 'MY_ROW', {COLUMN=>['pcf'],
> FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> true, Bytes.toBytes(1499010.to_java(:int)), false)}
> COLUMN                            CELL
>   pcf:\x00\x16\xDFx                timestamp=1568123881899,
> value=\x00\x16\xDFx
>   pcf:\x00\x16\xDFy                timestamp=1568123881899,
> value=\x00\x16\xDFy
>   pcf:\x00\x16\xDFz                timestamp=1568123881899,
> value=\x00\x16\xDFz
>   pcf:\x00\x16\xDF{                timestamp=1568123881899,
> value=\x00\x16\xDF{
>   pcf:\x00\x16\xDF|                timestamp=1568123881899,
> value=\x00\x16\xDF|
>   pcf:\x00\x16\xDF}                timestamp=1568123881899,
> value=\x00\x16\xDF}
>   pcf:\x00\x16\xDF~                timestamp=1568123881899,
> value=\x00\x16\xDF~
>   pcf:\x00\x16\xDF\x7F             timestamp=1568123881899,
> value=\x00\x16\xDF\x7F
>   pcf:\x00\x16\xDF\x80             timestamp=1568123881899,
> value=\x00\x16\xDF\x80
>   pcf:\x00\x16\xDF\x81             timestamp=1568123881899,
> value=\x00\x16\xDF\x81
> 1 row(s) in 0.0240 seconds
> hbase(main):060:0>
>
>
> We were able to reproduce the result consistently same, the pattern
> being bulk insert followed by bulk delete of most of the earlier columns.
>
>
> We observed the following behaviour while debugging the StoreScanner
> (regionserver).
>
> Case 1:
>
> 1. When StoreScanner.next() is called, it starts to iterate over the
> cells from the start of the rowkey.
>
> 2. As all the cells are deleted (from 0 to 1499000), we could see
> alternate delete and put type cells. Now, the
> NormalUserScanQueryMatcher.match() returns
> ScanQueryMatcher.MatchCode.SKIP and
> ScanQueryMatcher.MatchCode.SEEK_NEXT_COL for Delete and Put type cell
> respectively. This iteration happens throughout the range of 0 to 1499000.
>
> 3. This happens until a valid Put type cell is encountered, where the
> matcher applies the ColumnRangeFilter to the cell, which in turm returns
> ScanQueryMatcher.MatchCode.SEEK_NEXT_USING_HINT. In the next iteration
> it seeks directly to the desired column.
>
>
> Case 2:
>
> 1. When StoreScanner.next() is called, it starts to iterate over the
> cells from the start of the rowkey.
>
> 2. When the Put cell of qualifier 10 (\x0A) is encountered, the matcher
> returns ScanQueryMatcher.MatchCode.SEEK_NEXT_USING_HINT. In the next
> iteration it seeks directly to the desired column.
>
>
> Please let us know if this behaviour is intentional or it could be avoided.
>
> Regards,
>
> Solvannan R M
>
>
> On 2019/09/10 17:12:36, Josh Elser wrote:
>  > Deletes are held in memory. They represent data you have to traverse >
>  > until that data is flushed out to disk. When you write a new cell
> with a >
>  > qualifier of 10, that sorts, lexicographically, "early" with respect
> to >
>  > the other qualifiers you've written.>
>  >
>  > By that measure, if you are only scanning for the first column in this >
>  > row which you've loaded with deletes, it would make total sense to me >
>  > that the first case is slow and the second fast is fast>
>  >
>  > Can you please share exactly how you execute your "query" for
> both(all) >
>  > scenarios?>
>  >
>  > On 9/10/19 11:35 AM, Solvannan R M wrote:>
>  > > Hi,>
>  > > >
>  > > We have been using HBase (1.4.9) for a case where timeseries data
> is continuously inserted and deleted (high churn) against a single
> rowkey. The column keys would represent timestamp more or less. When we
> scan this data using ColumnRangeFilter for a recent time-range, scanner
> for the stores (memstore & storefiles) has to go through contiguous
> deletes, before it reaches the requested timerange data. While using
> this scan, we could notice 100% cpu usages in single core by the
> regionserver process.>
>  > > >
>  > > So, for our case, most of the cells with older timestamps will be
> in deleted state. While traversing these deleted cells, the regionserver
> process causing 100% cpu usage in single core.>
>  > > >
>  > > We tried to trace the code for scan and we observed the following
> behaviour.>
>  > > >
>  > > 1. While scanner is initialized, it seeked all the store-scanners
> to the start of the rowkey.>
>  > > 2. Then it traverses the deleted cells and discards it (as it was
> deleted) one by one.>
>  > > 3. When it encounters a valid cell (put type), it applies the
> filter and it returns SEEK_TO_NEXT_USING_HINT.>
>  > > 4. Now the scanner seeks to the required key directly and returning
> the results quickly then.>
>  > > >
>  > > For confirming the mentioned behaviour, we have done a test:>
>  > > 1. We have populated a single rowkey with column qualifier as a
> range of integers of 0 to 1500000 with random data.>
>  > > 2. We then deleted the column qualifier range of 0 to 1499000.>
>  > > 3. Now the data is only in memsore. No store file exists.>
>  > > 4. Now we scanned the rowkey with ColumnRangeFilter[1499000,
> 1499010).>
>  > > 5. The query took 12 seconds to execute. During this query, a
> single core is completely used>
>  > > 6. Then we put a new cell with qualifier 10.>
>  > > 7. Executed the same query, it took 0.018 seconds to execute.>
>  > > >
>  > > Kindly check this and advise !.>
>  > > >
>  > > Regards,>
>  > > Solvannan R M>
>  > > >
>  >
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message