hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Hbase performance with HDFS
Date Thu, 07 Jul 2011 19:09:24 GMT
When writing, we dump out files that are close to the HDFS block size when we flush.
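
As a rough illustration of the write side (a toy sketch, not HBase code): small random writes are collected in memory and only reach the filesystem as one large, sorted, write-once file at flush time. The class name `ToyMemstore`, the byte-count threshold, and the in-memory list standing in for an HDFS file are all assumptions made for this example; the write-ahead log is left out entirely.

```python
class ToyMemstore:
    """In-memory write buffer that flushes to immutable, sorted 'files'."""

    def __init__(self, flush_threshold_bytes):
        # In a real deployment the threshold would be chosen so that a
        # flushed file is close to the HDFS block size (assumption here).
        self.flush_threshold_bytes = flush_threshold_bytes
        self.buffer = {}           # key -> latest value
        self.buffered_bytes = 0
        self.flushed_files = []    # each entry: list of (key, value), sorted by key

    def put(self, key, value):
        # A small random write only touches memory.
        old = self.buffer.get(key)
        if old is not None:
            self.buffered_bytes -= len(key) + len(old)
        self.buffer[key] = value
        self.buffered_bytes += len(key) + len(value)
        if self.buffered_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self):
        # One large sequential write of sorted data; the file is never
        # modified afterwards, which suits a write-once filesystem.
        if not self.buffer:
            return
        self.flushed_files.append(sorted(self.buffer.items()))
        self.buffer = {}
        self.buffered_bytes = 0


store = ToyMemstore(flush_threshold_bytes=64)
for i in reversed(range(10)):
    store.put(f"k{i:03d}", "vvvv")   # 8 bytes per entry, written out of order
```

With these numbers the eighth put crosses the threshold, so one sorted file is flushed and the last two entries stay in memory.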

When reading, the files have an index, so we know where to seek to in
HDFS to find a particular value.  We then pull a block of the HDFS
file -- 64k (not 64MB) -- into memory and keep it in an LRU block
cache in case subsequent reads hit the same block.  We always read 64k
from HDFS (though this is configurable, and some report better random
read perf with blocks smaller than 64k).
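
The read path described above can be sketched roughly as follows. This is a toy model, not the HFile format or any HBase API: `ToyBlockFile`, `ToyReader`, the rows-per-block parameter (standing in for the 64k block size), and the cache capacity are all hypothetical names and values chosen for illustration.

```python
import bisect
from collections import OrderedDict


class ToyBlockFile:
    """Sorted key/value file split into small blocks, with an index of first keys."""

    def __init__(self, sorted_items, rows_per_block):
        self.blocks = [sorted_items[i:i + rows_per_block]
                       for i in range(0, len(sorted_items), rows_per_block)]
        self.index = [block[0][0] for block in self.blocks]  # first key of each block
        self.block_reads = 0  # counts simulated reads from the filesystem

    def read_block(self, block_no):
        self.block_reads += 1
        return self.blocks[block_no]


class ToyReader:
    """Looks up keys via the index and keeps recently used blocks in an LRU cache."""

    def __init__(self, block_file, cache_capacity):
        self.file = block_file
        self.cache = OrderedDict()  # block number -> block, oldest first
        self.cache_capacity = cache_capacity

    def get(self, key):
        # The index tells us which single block could hold the key.
        block_no = bisect.bisect_right(self.file.index, key) - 1
        if block_no < 0:
            return None  # key sorts before the first block
        for k, v in self._load(block_no):
            if k == key:
                return v
        return None

    def _load(self, block_no):
        if block_no in self.cache:
            self.cache.move_to_end(block_no)  # mark as most recently used
            return self.cache[block_no]
        block = self.file.read_block(block_no)
        self.cache[block_no] = block
        if len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict least recently used block
        return block


items = [(f"k{i:03d}", f"v{i}") for i in range(100)]
blockfile = ToyBlockFile(items, rows_per_block=10)
reader = ToyReader(blockfile, cache_capacity=2)
first = reader.get("k042")   # reads one block from the "file"
second = reader.get("k047")  # same block, served from the cache
```

The point of the sketch is that a random read costs one small block fetch rather than a scan of the whole file, and a repeat read of a nearby key costs no filesystem read at all.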

What else do you want to know?

St.Ack



On Thu, Jul 7, 2011 at 12:05 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> I understand that, but like I mentioned, the properties of HDFS are a lot
> different from the requirements of HBase in terms of access patterns,
> row sizes, write sizes, lots of updates, deletes, etc. So I was trying to
> understand how they are able to work together and still give the desired
> performance.
>
> On Thu, Jul 7, 2011 at 11:59 AM, Himanshu Vashishtha
> <hvashish@cs.ualberta.ca> wrote:
>> Mohit,
>> just like how SSTables are stored on GFS?
>> BigTable sstable => HBase HFile.
>>
>> Does this help?
>> Himanshu
>>
>> On Thu, Jul 7, 2011 at 12:53 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>>
>>> I have looked at Bigtable and its SSTables etc. But my question is
>>> directly related to how it's used with HDFS. HDFS recommends large
>>> files, bigger blocks, and write-once, read-many sequential reads. But
>>> reading and writing small rows is more random and
>>> different from the inherent design of HDFS. How do these two go together
>>> and still provide performance?
>>>
>>> On Thu, Jul 7, 2011 at 11:22 AM, Andrew Purtell <apurtell@apache.org>
>>> wrote:
>>> > Hi Mohit,
>>> >
>>> > Start here: http://labs.google.com/papers/bigtable.html
>>> >
>>> > Best regards,
>>> >
>>> >
>>> >     - Andy
>>> >
>>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>> >
>>> >
>>> >>________________________________
>>> >>From: Mohit Anchlia <mohitanchlia@gmail.com>
>>> >>To: user@hbase.apache.org
>>> >>Sent: Thursday, July 7, 2011 11:12 AM
>>> >>Subject: Hbase performance with HDFS
>>> >>
>>> >>I've been trying to understand how HBase can provide good performance
>>> >>on HDFS when HDFS is designed for large sequential reads and big blocks,
>>> >>which is inherently different from HBase, where access is more random and
>>> >>rows might be very small.
>>> >>
>>> >>I am reading this, but it doesn't answer my question. It does say that
>>> >>the HFile block size is different, but how it really works with HDFS is
>>> >>what I am trying to understand.
>>> >>
>>> >>http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
>>> >>
>>> >>
>>> >>
>>>
>>
>
