hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject RE: in-memory data grid vs. ehcache + hbase
Date Sun, 12 Jun 2011 15:20:53 GMT
> From: Hiller, Dean  x66079 <dean.hiller@broadridge.com>
> I would think most domains have a low write, high read rate

There are IN_MEMORY tables and blockcache in general for that.

> with low number of rows in certain tables so I am kinda
> surprised this optimization is not there.

Right, you want to replicate and distribute that data to avoid hotspotting.

We have a notion of handling this with:

  https://issues.apache.org/jira/browse/HBASE-2357

... pending available time apart from production concerns and bug fixes, of course, or a new
contribution.

This might accommodate those who want an out-of-the-box solution. It's commonly heard that
HBase has too many moving parts (HDFS, ZK, etc.). Suggesting the user layer a caching tier
on top using Hazelcast or whatever won't help our case there.

I would use Hazelcast or ehcache today if I wanted to layer a distributed hot value cache
on top of HBase. However the tradeoffs one assumes then are, I'd argue, application specific
concerns, which is why a generic solution has not emerged beyond some products that offer
configurable read and write quora and, sometimes, tools for reconciling conflicting edits.
That punts all the complexity up to the application anyway. I don't know if those knobs make
it any easier than if applications simply layer their own caching logic on top of consistent
storage.
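
To make "layer their own caching logic on top of consistent storage" concrete, here is a minimal sketch of a per-node read-through cache with explicit eviction. Everything in it is an assumption for illustration: the class name ReadThroughCache, the loader function (a stand-in for whatever HBase read the application performs), and the evict method are hypothetical and are not part of HBase, ehcache, or Hazelcast. How an eviction notice reaches every node is left out, since that is exactly the application-specific part.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-node cache in front of a consistent store.
// Reads are served from memory; a miss falls through to the loader.
final class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed: wraps the real store read

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, loading it from the store on a miss.
    // A null result from the loader is returned but not cached.
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    // Drops a possibly stale entry; the next get() reloads from the store.
    // The writer is assumed to trigger this on every node after its update.
    void evict(K key) {
        cache.remove(key);
    }
}
```

Until evict is called on a given node, that node keeps serving the old value, so this only suits data where a short window of inconsistency between nodes is acceptable.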

Pushing caching into HBase via HBASE-2357 or creating an external generic caching tier for
HBase may not be any more satisfactory. Remains to be seen.

  - Andy

--- On Sun, 6/12/11, Hiller, Dean  x66079 <dean.hiller@broadridge.com> wrote:

> From: Hiller, Dean  x66079 <dean.hiller@broadridge.com>
> Subject: RE: in-memory data grid vs. ehcache + hbase
> To: "Stack" <stack@duboce.net>, "user@hbase.apache.org" <user@hbase.apache.org>
> Cc: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
> Date: Sunday, June 12, 2011, 7:46 AM
> That is the backup plan if there isn't something out there of course ;). I just
> expected something since it is such a huge performance improvement. In fact, we
> may just re-implement the api so hopefully, it is a feature you just flip on and
> off and preferably a per table feature too....the speed up was so immense that I
> expected an existing project is all....I would think most domains have a low
> write, high read rate with low number of rows in certain tables so I am kinda
> surprised this optimization is not there.
> 
> Thanks,
> Dean
> 
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Thursday, June 09, 2011 10:39 AM
> To: user@hbase.apache.org
> Cc: Hiller, Dean x66079; hbase-user@hadoop.apache.org
> Subject: Re: in-memory data grid vs. ehcache + hbase
> 
> I don't know of a generic soln to the prob. you describe. Sounds like
> you have hacked up something for your purposes only the local cache is
> read-only? Can't you change your inserts so they update both hbase
> and push out to local cache?
> 
> St.Ack
> 
> On Thu, Jun 9, 2011 at 9:20 AM, Hiller, Dean  x66079 <dean.hiller@broadridge.com> wrote:
> > Oh, and I was hoping something like this kind of framework using the hbase
> > slaves file was already in existence...hard to believe it would not be since
> > our performance increase would be around 100 times in this case....we are
> > currently using something other than hbase and when we change to this type
> > of design it flies.
> >
> > Thanks,
> > Dean
> >
> > -----Original Message-----
> > From: Hiller, Dean x66079
> > Sent: Thursday, June 09, 2011 10:16 AM
> > To: user@hbase.apache.org
> > Cc: hbase-user@hadoop.apache.org
> > Subject: RE: in-memory data grid vs. ehcache + hbase
> >
> > Well, I was hoping there is something with ehcache or some kind of cache
> > where it would work like this
> >
> > 1. write using hbase client into the grid which came from some web update
> >    (which is VERY rare occurrence as this is admin stuff)
> > 2. write something out to all nodes telling it to evict the stale entry
> >    from the cache
> >
> > Then on the next read on any node, it gets the new data. It is okay if one
> > node gets a different value during the transition to the new value than
> > another node and that it becomes eventually consistent.
> >
> > Thanks,
> > Dean
> >
> > -----Original Message-----
> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> > Sent: Wednesday, June 08, 2011 12:01 PM
> > To: user@hbase.apache.org
> > Cc: hbase-user@hadoop.apache.org
> > Subject: Re: in-memory data grid vs. ehcache + hbase
> >
> > On Wed, Jun 8, 2011 at 9:00 AM, Hiller, Dean  x66079 <dean.hiller@broadridge.com> wrote:
> >> We have certain tables with under 10 rows, one under 200 rows and one with
> >> 1,000,000 rows. We have found out that having a copy/cache on each node is
> >> EXTREMELY fast for our batch processing since these copies of data are
> >> local AND in-memory. The issue I am struggling with is the best way to
> >> evict stale entries from the cache since these entries are rarely updated
> >> in our system, but we still need to evict them from all nodes. Anyone else
> >> struggling with this problem?
> >
> > You are caching hbase content in ehcache and you are trying to figure
> > how to have ehcache have a true reflection of hbase content?
> > St.Ack
> > This message and any attachments are intended only for
> the use of the addressee and
> > may contain information that is privileged and
> confidential. If the reader of the
> > message is not the intended recipient or an authorized
> representative of the
> > intended recipient, you are hereby notified that any
> dissemination of this
> > communication is strictly prohibited. If you have
> received this communication in
> > error, please notify us immediately by e-mail and
> delete the message and any
> > attachments from your system.
> >
> >
