hbase-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Large webmail storage and Hbase
Date Wed, 19 Nov 2008 23:54:38 GMT
Thanks for your information. It's really helpful to me :)

On Thu, Nov 20, 2008 at 3:58 AM, Joost Ouwerkerk <joost@openplaces.org> wrote:
> Edward,
>
> We're working on a user-facing web system backed by HBase.  More
> read-oriented than a mail system, but it does also have web users writing to
> it.  We're making heavy use of memcached because HBase random read is not
> fast enough.  Haven't tried BLOCKCACHE yet, but reading a random row from
> HBase generally costs us about 150ms, which when multiplied by 10-20 records
> is expensive.  We think it's this slow because of the quantity of data we're
> transporting, but haven't fully figured it out yet -- MySQL and memcached
> can deliver the same quantity of data in 1/10th the time.  If you can model
> your data to favour reading with scanners instead of randomly, I'm sure you
> could do much better.  I know that the scanner code was recently optimized
> with a batching strategy.
>
> We're using Solr/Lucene for secondary indexes & searching.  We often display
> indexed results instead of retrieving data from the database.  We generally
> do only one HBase getRow call per user HTTP request, the rest comes from
> Solr or memcached.
>
> We haven't rolled out beyond a small alpha user group, so the system is not
> proven in the real world.  Like Stack says: try it and see what happens.
> And be prepared to switch to an ugly MySQL sharding approach if it doesn't
> work out.
>
> j
>
> On Tue, Nov 18, 2008 at 9:21 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
>> Does anyone have some opinion about this?
>>
>> On Tue, Nov 18, 2008 at 11:18 AM, Edward J. Yoon <edwardyoon@apache.org>
>> wrote:
>> > Hi,
>> >
>> > I'm considering storing large-scale web-mail data in HBase.
>> > I expect it could handle both real-time and batch (e.g. spam
>> > filtering, from/to graph, etc.) workloads. But I'm still not sure
>> > whether it's suitable for storing web-mail data. A web-mail service
>> > needs a stable, online, real-time backend.
>> >
>> > Has anyone tried something similar (a real-time application), or does
>> > anyone know about the Gmail architecture?
>> > Any advice is welcome. Thanks!
>> >
>> > --
>> > Best Regards, Edward J. Yoon @ NHN, corp.
>> > edwardyoon@apache.org
>> > http://blog.udanax.org
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> edwardyoon@apache.org
>> http://blog.udanax.org
>>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
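
The caching approach Joost describes above (memcached in front of slow
HBase random reads, with roughly one getRow call per HTTP request) can be
sketched as a cache-aside reader. This is only an illustration under stated
assumptions: the class and function names are hypothetical, a plain dict
stands in for memcached, and `fetch_row` stands in for whatever HBase client
call is actually used. The 150 ms figure is the per-read cost quoted in the
thread, not a measurement of this code.

```python
from typing import Callable, Dict, Optional


def uncached_latency_ms(n_rows: int, per_read_ms: int = 150) -> int:
    """Rough cost of n sequential random reads at the quoted ~150 ms each."""
    return n_rows * per_read_ms


class ReadThroughCache:
    """Cache-aside reader: try the cache first, fall back to the slow store."""

    def __init__(self, fetch_row: Callable[[str], Optional[dict]]) -> None:
        self._fetch_row = fetch_row        # hypothetical stand-in for an HBase read
        self._cache: Dict[str, dict] = {}  # hypothetical stand-in for memcached
        self.store_reads = 0               # how many times the slow path was taken

    def get(self, row_key: str) -> Optional[dict]:
        row = self._cache.get(row_key)
        if row is not None:
            return row                     # cache hit: no store round trip
        self.store_reads += 1
        row = self._fetch_row(row_key)     # cache miss: one slow random read
        if row is not None:
            self._cache[row_key] = row
        return row

    def invalidate(self, row_key: str) -> None:
        """Drop a cached row after a user write so the next read is fresh."""
        self._cache.pop(row_key, None)


if __name__ == "__main__":
    backing_store = {"user:42": {"subject": "hello"}}
    cache = ReadThroughCache(backing_store.get)
    cache.get("user:42")
    cache.get("user:42")
    print(cache.store_reads)               # slow path taken only once
    print(uncached_latency_ms(10), uncached_latency_ms(20))
```

With the quoted numbers, 10 to 20 uncached random reads add up to about 1.5
to 3 seconds per page, which is why serving repeat reads from the cache and
invalidating on write matters for a system where web users also write.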
