hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Theoretical question...
Date Fri, 30 Apr 2010 13:14:45 GMT


Not exactly.

Within HBase, if you have access, you can do anything to any resource. I don't believe there's
a concept of permissions. (Unless you can use the HDFS permissions inside HBase...)

So one idea was to isolate the hbase instance within the cloud.
Since people talk about isolating hbase to different nodes than hadoop datanodes, this kind
of makes sense.

I'm not a big fan of HOD with respect to virtualized clouds. And to be clear, I mean taking
a physical box and splitting it into virtualized servers. Running HOD on a large cloud of
100+ physical servers may have some value in a corporate cloud that is a shared resource.

I'm sure there are folks at Sun, Dell, and IBM who would disagree with me, but when I take
an 8 core Intel box and split it into two virtual 3 core boxes, I wonder whether I will get
better performance than if I left it as a single 8 core box and ran more m/r jobs.

I realize that we have two issues for discussion. One is getting around the lack of security
and permissions within HBase, the other is virtualization.

Both are interesting areas for discussion.



> Date: Thu, 29 Apr 2010 18:09:35 -0700
> From: apurtell@apache.org
> Subject: Re: Theoretical question...
> To: hbase-user@hadoop.apache.org
> > From: Michael Segel
> > Imagine you have a cloud of 100 hadoop nodes.
> > In theory you could create multiple instances of HBase on
> > the cloud.
> > Obviously I don't think you could have multiple region
> > servers running on the same node.
> > The use case I was thinking about if you have a centralized
> > hadoop cloud and you wanted to have multiple developer
> > groups sharing the cloud as a resource rather than building
> > their own clouds.
> This is somewhat like HOD (http://hadoop.apache.org/common/docs/current/hod_user_guide.html).
> Have you looked at that?
> > The reason for the multiple hbase instances is that you
> > don't have a way of setting up multiple instances like
> > different Informix or Oracle databases/schemas on the same
> > infrastructure.
> Right.
> Well there is a simple (and under development, lightly tested as yet, etc.) multiuser
> mode in Stargate that gives multiple users each the illusion of a private HBase instance while
> sharing a common HBase cluster underneath. This is something I'll continue to work on as I
> have time.
> Also my employer is sponsoring development of HBASE-1697, and integration of HBase into
> the secure version of Hadoop (http://bit.ly/75011o) that Yahoo is working on. For example
> HBase would offer RBAC and might also use HDFS block tokens in a manner that allows you to
> reason about user isolation down through the whole stack.
> salesforce.com is a multitenant service built on a shared database infrastructure. They've
> talked about their rationale for building their SaaS service this way. It's worth it to Google
> a bit to find it and read. Partitioning cloud resources increases management complexity and
> reduces the benefit of the cloud -- the efficiencies of scale. It's technically possible to
> partition cloud resources but economically inefficient and suboptimal to do so.
>    - Andy