hbase-user mailing list archives

From Peter Veentjer <alarmnum...@gmail.com>
Subject Re: Using HBase in combination with HDFS directly
Date Wed, 05 Jan 2011 15:34:57 GMT
On Wed, Jan 5, 2011 at 4:03 PM, Friso van Vollenhoven <
fvanvollenhoven@xebia.com> wrote:

> Hi Peter,
> Do you mean you want to use the HDFS that HBase relies on for other things
> and not just exclusively HBase? That should be just fine. We do it all the
> time.
Ok thanks.

> Are you worried about putting too much load on it?

For the POC it won't matter that much. I can get my stuff up and running.

> I guess that depends on the type of workload that you have and what you do
> with it. But generally I think it is nice to have all nodes be the same (so
> every worker is both a datanode and a region server), such that you don't
> have to scale them out separately.

> Peter, are you based in The Netherlands by any chance? There is a NoSQL
> meetup group in NL (http://www.meetup.com/nosql-nl/) with meetups every
> now and then. The next one is on January 24 and is all about HBase. We're
> doing an on-the-spot install on a number of the laptops present to create a
> temporary cluster and play around with it. I have been working with Hadoop
> and HBase for the past couple of months, so if you care to come by, I'd be
> happy to share some experiences.

Yes, I live in Holland. I'm a former Xebia employee :) I think I'll visit one
of the NoSQL meetups.

We are building a kind of application server where, instead of providing
services like JMS, Servlets, EJBs, etc., we are providing services for secured
document storage, message exchange, semantic analysis of documents, etc. It
is all based on GigaSpaces, but I have the impression (after working with it
for more than a year) that it is very time-consuming to get right. Apart from
all the correctness issues (and there were/are many, based on bad usage of
GigaSpaces and architectural choices), there are also some
performance/scalability issues that need solving.

So I decided to rewrite the main use cases using HBase. I had most of the
functionality up and running in a few days, and most of the 'bad
architectural choices' we are going to remove in the next 6 months are not
there from the beginning (e.g. using streams instead of byte arrays for
document processing... how stupid can you be). It was also a nice exercise to
play with HBase and less consistent solutions.

I normally work on realizing very high consistency for Multiverse:


So I want to have some hands-on experience with using less consistent

> Friso
> On 5 jan 2011, at 14:41, Peter Veentjer wrote:
> > Hi Guys,
> >
> > I'm currently writing a POC based on HBase and I spend more time on writing
> > a UI than on writing the HBase functionality. So I'm very excited about
> > exploring HBase further and doing some serious performance and scalability
> > tests to see if we can use it as core technology instead of the
> > time/resource intensive GigaSpaces.
> >
> > My question:
> >
> > I'm currently using HBase and I also want to use the HDFS directly to store
> > files. If the HBase server(s) is installed, can I directly access the HDFS
> > of these servers, or is it better to set up a separate Hadoop server for
> > running HDFS?
