hbase-user mailing list archives

From tigertail <tyc...@yahoo.com>
Subject Re: How to read a subset of records based on a column value in a M/R job?
Date Thu, 18 Dec 2008 22:40:49 GMT

I believe you mean getRow(byte[] row, byte[][] columns, long ts), right?
I'm looking forward to 0.19.0 and 0.20.0. What is the release plan, BTW?
Thank you so much for all your input!


stack-3 wrote:
> 
> tigertail wrote:
> ...
>> 		RowResult rowResult = this.table.getRow(msgid);
>>
>> With this revision, the job now runs stably and takes 110 minutes to
>> read 10M records.
>> So for Q1, I can read 1M records in about 11 minutes, which looks OK.
>>
>>   
> Good.  If you were interested in a particular column only, that should 
> run faster (getRow is slow in that it has to check all resources to 
> make sure it has picked up every column that could be on the row, 
> whereas get with an explicit column knows it can stop when 
> row+column+timestamp matches).  That said, all this will be faster when 
> we get 0.19.0 out the door (in 0.19.0 it might help if the keys to get 
> are sorted, in that the next value might then come out of the 
> server-side blockcache) ... and faster again in 0.20.0.
> 
> St.Ack
> 
> 
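
For reference, here is a minimal sketch of the "sort the keys before getting them" idea from the quoted reply. It uses plain Java with no HBase dependency: the class name SortedRowKeys, the helpers bytes and sortedCopy, and the sample msgid values are all hypothetical, and the actual table lookup is left as a comment. The only assumption about HBase is that row keys are ordered as unsigned bytes, compared lexicographically. Whether sorting actually helps depends on the server-side block cache behaviour St.Ack describes for 0.19.0.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortedRowKeys {
    // Unsigned lexicographic comparison of byte arrays (the assumed row key order).
    static final Comparator<byte[]> BYTES_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // On a shared prefix, the shorter key sorts first.
        return a.length - b.length;
    };

    // Hypothetical helper: encode a string key as UTF-8 bytes.
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns a sorted copy so the caller's list is left untouched.
    static List<byte[]> sortedCopy(List<byte[]> keys) {
        List<byte[]> out = new ArrayList<>(keys);
        out.sort(BYTES_ORDER);
        return out;
    }

    public static void main(String[] args) {
        // Made-up message ids standing in for the keys a map task would look up.
        List<byte[]> msgids = Arrays.asList(
                bytes("msg-0042"), bytes("msg-0007"), bytes("msg-0019"));

        for (byte[] key : sortedCopy(msgids)) {
            // In the real job the lookup would go here, for example the
            // getRow(byte[] row, byte[][] columns, long ts) overload
            // mentioned in this thread, restricted to the needed column.
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
    }
}
```

Sorting a copy keeps the original order available in case the job needs to emit results in input order.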

-- 
View this message in context: http://www.nabble.com/How-to-read-a-subset-of-records-based-on-a-column-value-in-a-M-R-job--tp20963771p21082364.html
Sent from the HBase User mailing list archive at Nabble.com.

