hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject HBase mention in VLDB keynote
Date Tue, 25 Aug 2009 10:17:12 GMT
In his keynote address here at VLDB 2009 (http://vldb2009.org/?q=node/22), Raghu Ramakrishnan,
Yahoo! Research's Chief Scientist, made prominent mention of HBase, much to my surprise (and
later chagrin). This happened near the end of the talk when a number of the new elastic/scalable/"nosql"
storage systems were discussed to make concrete some of the architectural and data model points
made earlier. The alternatives considered were Yahoo's PNUTS, sharded MySQL, HBase, and Cassandra.
I don't know what version of HBase was used exactly but unfortunately the message was "not
ready yet". Perhaps it was a configuration or provisioning issue but HBase did not really
survive the evaluation, leading to short hyperbolic performance curves terminating on the
far left of the various graphs. This was quite disappointing to see as the other alternatives
were apparently successfully tested on what can be presumed to be the same resources. It stands
to reason there is opportunity for HBase to improve here if only we know what that is. It was also a little
disappointing that, judging by a mailing list search, these issues were not brought
to either hbase-dev@ or hbase-user@; only a minor question relating to the REST interface was.
Perhaps the community could have identified a specific configuration problem, recommended
a correction for a deployment/provisioning error, or resolved a bug. To future evaluators
of HBase, on behalf of the community I humbly request that you share your results, good or
bad, so we can take the feedback, or the bug reports and their artifacts (logs, etc.) and
improve our software. 

At least, the story has already changed from what was presented today -- for example, the
multimaster architecture of 0.20 was not presented, rather the older one (circa 0.19); and
JG's/Ryan's performance test results for 0.20 stand as a contradiction. We should look into
opportunities to produce a peer reviewed positive contribution. I think we have opportunities
to take some novel approaches in the system itself and/or produce a novel vertical contribution
and 0.20 is a good substrate for that.

Though this was unfortunately a missed opportunity for a good showing for HBase in particular,
the keynote in general was a well formulated introduction of the emerging area of "cloud scale"
storage / "nosql" systems to the largest elite gathering of database and data processing researchers
in the world. Importantly, the presentation was also a call for participation in the future
development and directions of the new and growing "nosql" constellation. Such participation,
whether it is specific involvement with the HBase project or not, would be and is most welcome
as the problem of serving data at very large scale under "cloud" constraints is an area of
both significant challenge and significant promise. HBase, like other projects in this area,
is in an early stage of development. These projects cover the use cases of their creators but
are not answers to the larger set of problems -- that space is untapped and only waiting for
creativity and effort. I think I can speak for HBase in particular: we welcome this and would be pleased to assist
at every opportunity. 

    - Andy
