hbase-user mailing list archives

From Jürgen Jakobitsch <jakobits...@punkt.at>
Subject Re: Using SPARQL against HBase
Date Thu, 01 Apr 2010 20:12:43 GMT
hi,

this sounds very interesting to me, i'm currently fiddling
around with a suitable row and column setup for triples.

i'm about to implement openrdf's sail api for hbase (i just did 
a lucene quad store implementation which is superfast and scales 
to a couple of hundreds of millions of triples (http://turnguard.com/tuqs)) 
but i'm in my first days of hbase encounters, so my experience
in row column design is limited.

from my point of view the problem is to store really efficiently,
besides the triples themselves, the contexts (named graphs) and
the languages of literals.
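
One possible way to approach this, sketched in plain Python with no HBase client involved: encode each quad into several composite row keys, one per index ordering, and carry the literal's language tag as a suffix on the object. The function name `quad_row_keys`, the NUL separator, the `@lang` suffix, and the four orderings are all assumptions made for illustration, not a schema proposed in this thread.

```python
# Hypothetical composite row keys for a quad (subject, predicate, object, context).
# The separator and the set of index orderings are illustrative assumptions.
SEP = "\x00"  # NUL sorts before any printable character, so prefixes group cleanly

def quad_row_keys(s, p, o, context, lang=None):
    """Return one row key per index ordering for a single quad."""
    # Attach the language tag to the literal so "hallo"@de and "hallo"@en differ.
    obj = o if lang is None else f"{o}@{lang}"
    parts = {"s": s, "p": p, "o": obj, "c": context}
    orderings = ("spoc", "posc", "ospc", "cspo")
    return {order: SEP.join(parts[k] for k in order) for order in orderings}

keys = quad_row_keys("ex:a", "rdfs:label", "hallo", "ex:graph1", lang="de")
```

Because rows in a sorted key-value store are ordered by key, a prefix scan over the `cspo` ordering would return everything in one named graph, and a prefix scan over `posc` would return all subjects for a given predicate and object. The cost is that each quad is written once per ordering.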

by the way : i just did a small tablemanager (in beta) that lets
you create htables -> from <- rdf (see http://sourceforge.net/projects/hbasetablemgr/)

i'd be really happy to contribute on the rdf and sparql side,
but could certainly use some help on the hbase table design side.

wkr www.turnguard.com/turnguard



----- Original Message -----
From: "Raffi Basmajian" <rbasmajian@oppenheimerfunds.com>
To: hbase-user@hadoop.apache.org, apurtell@apache.org
Sent: Thursday, April 1, 2010 9:45:59 PM
Subject: RE: Using SPARQL against HBase


This is an interesting article from a few guys over at BBN/Raytheon. By
storing triples in flat files they used a custom algorithm, detailed in
the article, to iterate the WHERE clause from a SPARQL query and reduce
the map into the desired result. 
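
The article's algorithm is not reproduced in this thread. As a generic illustration only of what iterating over the triple patterns of a WHERE clause can look like, here is a naive in-memory sketch. The names `match_pattern` and `evaluate` are hypothetical, and this is neither the BBN/Raytheon method nor a MapReduce job.

```python
# Naive evaluation of a basic graph pattern: each pattern in the WHERE clause
# is applied in turn, extending the variable bindings found so far.
# Terms starting with "?" are variables; everything else must match exactly.

def match_pattern(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`; return None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def evaluate(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match_pattern(pattern, t, b)) is not None]
    return bindings

triples = [("alice", "knows", "bob"), ("bob", "knows", "carol")]
result = evaluate([("?x", "knows", "?y"), ("?y", "knows", "?z")], triples)
```

Every pattern here scans the full triple list, so this only shows the join logic. A distributed version would have to partition that work, which is the part the article addresses.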

This is very similar to what I need to do; the only difference being
that our data is stored in Hbase tables, not as triples in flat files. 
 

-----Original Message-----
From: Amandeep Khurana [mailto:amansk@gmail.com] 
Sent: Wednesday, March 31, 2010 3:30 PM
To: hbase-user@hadoop.apache.org; apurtell@apache.org
Subject: Re: Using SPARQL against HBase

Why do you need to build an in-memory graph which you would want to
read/write to? You could store the graph in HBase directly. As pointed
out, HBase might not be the best suited for SPARQL queries, but it's not
impossible to do. Using the triples, you can form a graph that can be
represented in HBase as an adjacency list. I've stored graphs with
16-17M nodes which was data equivalent to about 600M triples. And this
was on a small cluster and could certainly scale way more than 16M graph
nodes.
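
A minimal sketch of the adjacency-list idea, using a plain Python dict to stand in for an HBase table: one row per subject node, with each outgoing edge stored as a column. The `edge:<predicate>|<object>` column naming and both function names are assumptions for illustration, not the schema actually used for the graphs mentioned above.

```python
from collections import defaultdict

def triples_to_adjacency(triples):
    """Group (subject, predicate, object) triples into one row per subject.

    Each row maps a column name "edge:<predicate>|<object>" to an empty value,
    mimicking a wide row whose column qualifiers hold the outgoing edges.
    """
    rows = defaultdict(dict)
    for s, p, o in triples:
        rows[s][f"edge:{p}|{o}"] = b""
    return dict(rows)

def neighbours(rows, subject, predicate):
    """Return the sorted objects reachable from `subject` via `predicate`."""
    prefix = f"edge:{predicate}|"
    columns = rows.get(subject, {})
    return sorted(col[len(prefix):] for col in columns if col.startswith(prefix))

rows = triples_to_adjacency([
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
])
friends = neighbours(rows, "alice", "knows")
```

With this layout, fetching all edges of a node is a single row read, which is what makes graph traversal workable on a row-oriented store.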

In case you are interested in working on SPARQL over HBase, we could
collaborate on it...

-ak


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Mar 31, 2010 at 11:56 AM, Andrew Purtell
<apurtell@apache.org> wrote:

> Hi Raffi,
>
> To read up on fundamentals I suggest Google's BigTable paper:
> http://labs.google.com/papers/bigtable.html
>
> Detail on how HBase implements the BigTable architecture within the 
> Hadoop ecosystem can be found here:
>
>  http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
>  http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
>
>  http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>
> Hope that helps,
>
>   - Andy
>
> > From: Basmajian, Raffi <rbasmajian@oppenheimerfunds.com>
> > Subject: RE: Using SPARQL against HBase
> > To: hbase-user@hadoop.apache.org, apurtell@apache.org
> > Date: Wednesday, March 31, 2010, 11:42 AM
> >
> > If Hbase can't respond to SPARQL-like queries, then what type of
> > query language can it respond to? In a traditional RDBMS database
> > one would use SQL; so what is the counterpart query language with
> > Hbase?
>
>
>
>
>



-- 
punkt. netServices
______________________________
Jürgen Jakobitsch
Codeography

Lerchenfelder Gürtel 43 Top 5/2
A - 1160 Wien
Tel.: 01 / 897 41 22 - 29
Fax: 01 / 897 41 22 - 22

netServices http://www.punkt.at

