hbase-user mailing list archives

From J Mohamed Zahoor <jmo...@gmail.com>
Subject Re: Slow scanning for PrefixFilter on EncodedBlocks
Date Wed, 17 Oct 2012 08:44:57 GMT
First I upgraded my cluster to 0.94.2, but the problem persisted.
Then I switched to using startRow instead of the prefix filter.


./zahoor
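
A minimal sketch of the startRow approach mentioned above, assuming the goal is to read only the rows that share a key prefix. Because PrefixFilter does not seek by itself, the scan can be bounded by row range instead: start at the prefix and stop at the first key that sorts after every key with that prefix. The class and method names here (PrefixScanRange, stopRowForPrefix) are hypothetical helpers, not part of HBase, and the code is plain Java so it runs without a cluster.

```java
import java.util.Arrays;

// Hypothetical helper (not an HBase class): computes an exclusive stop row
// for a scan that should cover exactly the keys beginning with a prefix.
public final class PrefixScanRange {
    private PrefixScanRange() {}

    /**
     * Returns the smallest byte string that sorts after every key starting
     * with the given prefix, using unsigned lexicographic byte order.
     * Returns an empty array when no such bound exists (empty prefix or a
     * prefix made only of 0xFF bytes); HBase treats an empty stop row as
     * "scan to the end of the table".
     */
    public static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Copy the remaining bytes and increment the last one.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }
}
```

With the 0.94 client this would be used roughly as new Scan(prefix, PrefixScanRange.stopRowForPrefix(prefix)), where the stop row is exclusive. The PrefixFilter can still be left in the filter list, but the row range is what keeps the scan from starting at the beginning of the table.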

On Wed, Oct 17, 2012 at 2:12 PM, J Mohamed Zahoor <jmozah@gmail.com> wrote:

> Sorry for the delay.
>
> It looks like the problem is caused by PrefixFilter.
> I assumed that it does a seek.
>
> If I use startRow instead, it works fine. But is that the correct approach?
>
> ./zahoor
>
>
> On Wed, Oct 17, 2012 at 3:38 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>
>> I reopened HBASE-6577
>>
>>
>>
>> ----- Original Message -----
>> From: lars hofhansl <lhofhansl@yahoo.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org>; lars hofhansl <
>> lhofhansl@yahoo.com>
>> Cc:
>> Sent: Tuesday, October 16, 2012 2:39 PM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> Looks like this is exactly the scenario I was trying to optimize with
>> HBASE-6577. Hmm...
>> ________________________________
>> From: lars hofhansl <lhofhansl@yahoo.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org>
>> Sent: Tuesday, October 16, 2012 12:21 AM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> PrefixFilter does not do any seeking by itself, so I doubt this is
>> related to HBASE-6757.
>> Does this only happen with FAST_DIFF compression?
>>
>>
>> If you can create an isolated test program (that sets up the scenario and
>> then runs a scan with the filter such that it is very slow), I'm happy to
>> take a look.
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: J Mohamed Zahoor <jmozah@gmail.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org>
>> Cc:
>> Sent: Monday, October 15, 2012 10:27 AM
>> Subject: Re: Slow scanning for PrefixFilter on EncodedBlocks
>>
>> Is this related to HBASE-6757?
>> I use a filter list with
>>   - prefix filter
>>   - filter list of column filters
>>
>> /zahoor
>>
>> On Monday, October 15, 2012, J Mohamed Zahoor wrote:
>>
>> > Hi
>> >
>> > My scanner performance is very slow when using a PrefixFilter on an
>> > **Encoded Column** (encoded using FAST_DIFF both in memory and on disk).
>> > I am using HBase 0.94.1.
>> >
>> > jstack shows that much of the time is spent seeking to the row.
>> > Even if I give an exact row key match in the prefix filter, it takes about
>> > two minutes to return a single row.
>> > Running this multiple times also seems to send reads to disk
>> > (loadBlock).
>> >
>> >
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
>> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
>> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
>> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
>> > at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
>> > at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
>> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
>> > at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
>> > - locked <0x000000059584fab8> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3455)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3406)
>> > - locked <0x000000059589bb30> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>> > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3423)
>> >
>> > If I set the start and end row to the same row in the scan, it comes back
>> > very quickly.
>> >
>> > Saw this link
>> >
>> http://search-hadoop.com/m/9f0JH1Kz24U1&subj=Re+HBase+0+94+2+SNAPSHOT+Scanning+Bug
>> > But it looks like things are fine in 0.94.1.
>> >
>> > Any pointers on why this is slow?
>> >
>> >
>> > Note: the row does not have many columns (5, and less than a KB) but has
>> > lots of versions (1500+).
>> >
>> > ./zahoor
>> >
>> >
>> >
>>
>>
>
