hbase-user mailing list archives

From Solvannan R M <solvanna...@zoho.com.INVALID>
Subject Re: HBase Scan consumes high cpu
Date Sun, 15 Sep 2019 05:18:20 GMT
Hi Ramkrishna,

Thank you for your inputs! Unfortunately, we do not know the column 
names beforehand. We generated the above scenario for illustration 
purposes only.

The intent of our query is: given a single row key, a start column 
key and an end column key, scan for the columns that lie between the two 
column keys. We have been achieving that by using ColumnRangeFilter. 
Our write pattern is a Put followed immediately by a Delete 
(KEEP_DELETED_CELLS is set to false). As more Deletes accumulate, 
we notice that the scan time becomes very long and the CPU usage 
shoots up to 100% on one core during every scan. While debugging, we 
observed the following behavior:

At any instant, the cells of the particular row would be roughly 
organized like

D1 P1 D2 P2 D3 P3 ............ Dn-1 Pn-1 Dn Pn Pn+1 Pn+2 Pn+3 Pn+4....

where each D is a Delete and each P is its corresponding Put. The newer 
values from Pn+1 onwards haven't been deleted yet.

When the scan starts, inside the StoreScanner, 
NormalUserScanQueryMatcher matches the first cell (D1). It is added to 
the DeleteTracker and a MatchCode of SKIP is returned. For the next 
cell (P1), the matcher checks with the DeleteTracker and returns 
SEEK_NEXT_COL. The next cell is D2, and the same two steps repeat 
alternately. No filter is applied during this phase. This goes on until 
the matcher encounters the first Put that has not been deleted, where 
the filter is applied, SEEK_NEXT_USING_HINT is returned, and a reseek 
positions the scanner near the desired range. The result is returned 
quickly after that.
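
The loop described above can be sketched as a toy model. This is not 
HBase code: the (kind, qualifier) tuples, the function names and the 
set standing in for the DeleteTracker are invented for illustration, 
and only the three match codes named above are modelled.

```python
from enum import Enum


class MatchCode(Enum):
    SKIP = "SKIP"
    SEEK_NEXT_COL = "SEEK_NEXT_COL"
    SEEK_NEXT_USING_HINT = "SEEK_NEXT_USING_HINT"


def build_row(total, deleted):
    """Cells in scan order: a deleted column is a Delete followed by its
    Put; a live column is a lone Put."""
    cells = []
    for qualifier in range(total):
        if qualifier < deleted:
            cells.append(("D", qualifier))
        cells.append(("P", qualifier))
    return cells


def match_codes_until_hint(cells):
    """Count the match codes returned up to and including the first cell
    that reaches the filter (modelled as SEEK_NEXT_USING_HINT)."""
    delete_tracker = set()  # hypothetical stand-in for the DeleteTracker
    counts = {code: 0 for code in MatchCode}
    for kind, qualifier in cells:
        if kind == "D":
            delete_tracker.add(qualifier)
            counts[MatchCode.SKIP] += 1
        elif qualifier in delete_tracker:
            counts[MatchCode.SEEK_NEXT_COL] += 1
        else:
            # First live Put: only now does the filter get to run.
            counts[MatchCode.SEEK_NEXT_USING_HINT] += 1
            break
    return counts


if __name__ == "__main__":
    counts = match_codes_until_hint(build_row(15_000, 14_990))
    for code, n in counts.items():
        print(code.value, n)
```

In this model the work done before the filter runs grows linearly with 
the number of deleted columns, while a live Put at the front of the row 
ends the loop on the first cell.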

The SKIP iterations happen a lot because our pattern leaves very few 
live cells, and only towards the latest column qualifiers (which sort 
high lexicographically). We were wondering whether the query could be 
modified so that the filter is applied up front, or whether there is 
some other way to seek to the desired range directly.
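
To illustrate what "seek directly" would mean, here is a minimal sketch 
over a plain sorted list of qualifiers. It is only an analogy, not how 
the StoreScanner is implemented, and it ignores delete markers 
entirely; the helper names are made up. The one HBase fact it relies on 
is that Bytes.toBytes(int) produces a 4-byte big-endian value, so 
non-negative integer qualifiers sort in numeric order.

```python
import bisect
import struct


def to_bytes(n):
    """4-byte big-endian encoding, the layout Bytes.toBytes(int) produces."""
    return struct.pack(">i", n)


def column_range(sorted_qualifiers, start, stop):
    """Return the qualifiers in [start, stop) using two binary searches
    instead of walking the list from the beginning."""
    lo = bisect.bisect_left(sorted_qualifiers, to_bytes(start))
    hi = bisect.bisect_left(sorted_qualifiers, to_bytes(stop))
    return sorted_qualifiers[lo:hi]


if __name__ == "__main__":
    # Non-negative ints encoded this way are already in sorted byte order.
    qualifiers = [to_bytes(n) for n in range(1_498_000, 1_500_000)]
    for q in column_range(qualifiers, 1_499_000, 1_499_010):
        print(q)
```

The range used here is the same [1499000, 1499010) range as in the 
quoted test, and the first qualifier it returns is the 
\x00\x16\xDFx value shown in the shell output.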

Regards,
Solvannan R M


On 2019/09/13 15:53:51, ramkrishna vasudevan wrote:
> Hi
> Generally if you can form the column names like you did in the above
> case it is always better you add them using
> scan#addColumn(family, qual). I am not sure of the shell syntax to add
> multiple columns but am sure there is a provision to do it.
>
> This will ensure that the scan starts from the given column and
> fetches the required column only. In your case probably you need to
> pass a set of qualifiers (instead of just 1).
>
> Regards
> Ram
>
> On Fri, Sep 13, 2019 at 8:45 PM Solvannan R M wrote:
>
> > Hi Anoop,
> >
> > We have executed the query with the qualifier set like you advised.
> > But we don't get the results for the range; only the specified
> > qualifier cell is returned.
> >
> > Query & Result:
> >
> > hbase(main):008:0> get 'mytable', 'MY_ROW',
> > {COLUMN=>["pcf:\x00\x16\xDFx"],
> > FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> > true, Bytes.toBytes(1499010.to_java(:int)), false)}
> > COLUMN CELL
> > pcf:\x00\x16\xDFx timestamp=1568380663616, value=\x00\x16\xDFx
> > 1 row(s) in 0.0080 seconds
> >
> > hbase(main):009:0>
> >
> > Is there any other way to get around this?
> >
> > Regards,
> > Solvannan R M
> >
> > On 2019/09/13 04:53:45, Anoop John wrote:
> > > Hi
> > > When you did a put with a lower qualifier int (put 'mytable',
> > > 'MY_ROW', "pcf:\x0A", "\x00") the system flow is getting a valid
> > > cell at the 1st step itself and that is getting passed to the Filter.
> > > The Filter is doing a seek which just avoids all the in-between
> > > deletes and puts processing.
> > > In the 1st case the Filter won't get into action at all unless the
> > > scan flow sees a valid cell. The delete processing happens as the 1st
> > > step, before the filter processing step happens.
> > >
> > > In this case I am wondering why you can not add the specific 1st
> > > qualifier in the get part itself along with the column range filter.
> > > I mean
> > >
> > > get 'mytable', 'MY_ROW', {COLUMN=>['pcf:1499000'],
> > > FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> > > true, Bytes.toBytes(1499010.to_java(:int)), false)}
> > >
> > > Pardon the syntax, it might not be proper for the shell. Can this be
> > > done? This will make the scan seek to the given qualifier at the 1st
> > > step itself.
> > >
> > > Anoop
> > >
> > > On Thu, Sep 12, 2019 at 10:18 PM Udai Bhan Kashyap (BLOOMBERG/
> > > PRINCETON) <ukashyap1@bloomberg.net> wrote:
> > >
> > > > Are you keeping the deleted cells? Check 'VERSIONS' for the column
> > > > family and set it to 1 if you don't want to keep the deleted cells.
> > > >
> > > > From: user@hbase.apache.org At: 09/12/19 12:40:01 To:
> > > > user@hbase.apache.org
> > > > Subject: Re: HBase Scan consumes high cpu
> > > >
> > > > Hi,
> > > >
> > > > As said earlier, we have populated the rowkey "MY_ROW" with integers
> > > > from 0 to 1500000 as column qualifiers. Then we have deleted the
> > > > qualifiers from 0 to 1499000.
> > > >
> > > > We executed the following query. It took 15.3750 seconds to execute.
> > > >
> > > > hbase(main):057:0> get 'mytable', 'MY_ROW', {COLUMN=>['pcf'],
> > > > FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> > > > true, Bytes.toBytes(1499010.to_java(:int)), false)}
> > > > COLUMN CELL
> > > > pcf:\x00\x16\xDFx timestamp=1568123881899, value=\x00\x16\xDFx
> > > > pcf:\x00\x16\xDFy timestamp=1568123881899, value=\x00\x16\xDFy
> > > > pcf:\x00\x16\xDFz timestamp=1568123881899, value=\x00\x16\xDFz
> > > > pcf:\x00\x16\xDF{ timestamp=1568123881899, value=\x00\x16\xDF{
> > > > pcf:\x00\x16\xDF| timestamp=1568123881899, value=\x00\x16\xDF|
> > > > pcf:\x00\x16\xDF} timestamp=1568123881899, value=\x00\x16\xDF}
> > > > pcf:\x00\x16\xDF~ timestamp=1568123881899, value=\x00\x16\xDF~
> > > > pcf:\x00\x16\xDF\x7F timestamp=1568123881899, value=\x00\x16\xDF\x7F
> > > > pcf:\x00\x16\xDF\x80 timestamp=1568123881899, value=\x00\x16\xDF\x80
> > > > pcf:\x00\x16\xDF\x81 timestamp=1568123881899, value=\x00\x16\xDF\x81
> > > > 1 row(s) in 15.3750 seconds
> > > >
> > > > Now we inserted a new column with qualifier 10 (\x0A), such that it
> > > > comes earlier in lexicographical order. Now we executed the same
> > > > query. It only took 0.0240 seconds.
> > > >
> > > > hbase(main):058:0> put 'mytable', 'MY_ROW', "pcf:\x0A", "\x00"
> > > > 0 row(s) in 0.0150 seconds
> > > > hbase(main):059:0> get 'mytable', 'MY_ROW', {COLUMN=>['pcf'],
> > > > FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int)),
> > > > true, Bytes.toBytes(1499010.to_java(:int)), false)}
> > > > COLUMN CELL
> > > > pcf:\x00\x16\xDFx timestamp=1568123881899, value=\x00\x16\xDFx
> > > > pcf:\x00\x16\xDFy timestamp=1568123881899, value=\x00\x16\xDFy
> > > > pcf:\x00\x16\xDFz timestamp=1568123881899, value=\x00\x16\xDFz
> > > > pcf:\x00\x16\xDF{ timestamp=1568123881899, value=\x00\x16\xDF{
> > > > pcf:\x00\x16\xDF| timestamp=1568123881899, value=\x00\x16\xDF|
> > > > pcf:\x00\x16\xDF} timestamp=1568123881899, value=\x00\x16\xDF}
> > > > pcf:\x00\x16\xDF~ timestamp=1568123881899, value=\x00\x16\xDF~
> > > > pcf:\x00\x16\xDF\x7F timestamp=1568123881899, value=\x00\x16\xDF\x7F
> > > > pcf:\x00\x16\xDF\x80 timestamp=1568123881899, value=\x00\x16\xDF\x80
> > > > pcf:\x00\x16\xDF\x81 timestamp=1568123881899, value=\x00\x16\xDF\x81
> > > > 1 row(s) in 0.0240 seconds
> > > > hbase(main):060:0>
> > > >
> > > > We were able to reproduce the result consistently, the pattern
> > > > being bulk insert followed by bulk delete of most of the earlier
> > > > columns.
> > > >
> > > > We observed the following behaviour while debugging the StoreScanner
> > > > (regionserver).
> > > >
> > > > Case 1:
> > > >
> > > > 1. When StoreScanner.next() is called, it starts to iterate over the
> > > > cells from the start of the rowkey.
> > > >
> > > > 2. As all the cells are deleted (from 0 to 1499000), we could see
> > > > alternate delete and put type cells. Now, the
> > > > NormalUserScanQueryMatcher.match() returns
> > > > ScanQueryMatcher.MatchCode.SKIP and
> > > > ScanQueryMatcher.MatchCode.SEEK_NEXT_COL for Delete and Put type
> > > > cells respectively. This iteration happens throughout the range of
> > > > 0 to 1499000.
> > > >
> > > > 3. This happens until a valid Put type cell is encountered, where the
> > > > matcher applies the ColumnRangeFilter to the cell, which in turn
> > > > returns ScanQueryMatcher.MatchCode.SEEK_NEXT_USING_HINT. In the next
> > > > iteration it seeks directly to the desired column.
> > > >
> > > > Case 2:
> > > >
> > > > 1. When StoreScanner.next() is called, it starts to iterate over the
> > > > cells from the start of the rowkey.
> > > >
> > > > 2. When the Put cell of qualifier 10 (\x0A) is encountered, the
> > > > matcher returns ScanQueryMatcher.MatchCode.SEEK_NEXT_USING_HINT. In
> > > > the next iteration it seeks directly to the desired column.
> > > >
> > > > Please let us know if this behaviour is intentional or if it could
> > > > be avoided.
> > > >
> > > > Regards,
> > > > Solvannan R M
> > > >
> > > > On 2019/09/10 17:12:36, Josh Elser wrote:
> > > > > Deletes are held in memory. They represent data you have to
> > > > > traverse until that data is flushed out to disk. When you write a
> > > > > new cell with a qualifier of 10, that sorts, lexicographically,
> > > > > "early" with respect to the other qualifiers you've written.
> > > > >
> > > > > By that measure, if you are only scanning for the first column in
> > > > > this row which you've loaded with deletes, it would make total
> > > > > sense to me that the first case is slow and the second is fast.
> > > > >
> > > > > Can you please share exactly how you execute your "query" for
> > > > > both(all) scenarios?
> > > > >
> > > > > On 9/10/19 11:35 AM, Solvannan R M wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We have been using HBase (1.4.9) for a case where timeseries
> > > > > > data is continuously inserted and deleted (high churn) against
> > > > > > a single rowkey. The column keys would represent timestamp more
> > > > > > or less. When we scan this data using ColumnRangeFilter for a
> > > > > > recent time-range, the scanner for the stores (memstore &
> > > > > > storefiles) has to go through contiguous deletes before it
> > > > > > reaches the requested timerange data. While using this scan, we
> > > > > > could notice 100% cpu usage in a single core by the
> > > > > > regionserver process.
> > > > > >
> > > > > > So, for our case, most of the cells with older timestamps will
> > > > > > be in deleted state. While traversing these deleted cells, the
> > > > > > regionserver process causes 100% cpu usage in a single core.
> > > > > >
> > > > > > We tried to trace the code for scan and we observed the
> > > > > > following behaviour.
> > > > > >
> > > > > > 1. While the scanner is initialized, it seeks all the
> > > > > > store-scanners to the start of the rowkey.
> > > > > > 2. Then it traverses the deleted cells and discards them (as
> > > > > > they were deleted) one by one.
> > > > > > 3. When it encounters a valid cell (put type), it applies the
> > > > > > filter and it returns SEEK_NEXT_USING_HINT.
> > > > > > 4. Now the scanner seeks to the required key directly and
> > > > > > returns the results quickly.
> > > > > >
> > > > > > For confirming the mentioned behaviour, we have done a test:
> > > > > > 1. We have populated a single rowkey with column qualifiers as
> > > > > > a range of integers from 0 to 1500000 with random data.
> > > > > > 2. We then deleted the column qualifier range of 0 to 1499000.
> > > > > > 3. Now the data is only in the memstore. No store file exists.
> > > > > > 4. Now we scanned the rowkey with ColumnRangeFilter[1499000,
> > > > > > 1499010).
> > > > > > 5. The query took 12 seconds to execute. During this query, a
> > > > > > single core is completely used.
> > > > > > 6. Then we put a new cell with qualifier 10.
> > > > > > 7. Executed the same query, it took 0.018 seconds to execute.
> > > > > >
> > > > > > Kindly check this and advise!
> > > > > >
> > > > > > Regards,
> > > > > > Solvannan R M
