giraph-dev mailing list archives

From "Dan Brickley (Created) (JIRA)" <>
Subject [jira] [Created] (GIRAPH-170) Workflow for loading RDF graph data into Giraph
Date Thu, 05 Apr 2012 18:38:26 GMT
Workflow for loading RDF graph data into Giraph

                 Key: GIRAPH-170
             Project: Giraph
          Issue Type: New Feature
            Reporter: Dan Brickley
            Priority: Minor

W3C RDF provides a family of Web standards for exchanging graph-based data. RDF uses sets
of simple binary relationships, labeling nodes and links with Web identifiers (URIs). Many
public datasets are available as RDF, including the "Linked Data" cloud (see
). Many such datasets are listed at

RDF has several standard exchange syntaxes. The oldest is RDF/XML. A simple line-oriented
format is N-Triples. A format aligned with RDF's SPARQL query language is Turtle. Apache Jena
and Any23 provide software to handle all of these.
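
To illustrate why the line-oriented N-Triples format is convenient for Hadoop-style processing, here is a minimal sketch of a per-line parser. It is an assumption-laden simplification: the regular expression covers only the common cases (URI or blank-node subjects, URI predicates, and URI, blank-node, or quoted-literal objects) and is not a complete N-Triples grammar. The function name parse_ntriples_line is hypothetical; a real workflow would more likely rely on a library such as Apache Jena or Any23.

```python
import re

# Simplified pattern for one N-Triples statement (an assumption, not the full grammar):
#   subject   : <URI> or _:blankNode
#   predicate : <URI>
#   object    : <URI>, _:blankNode, or "literal" with optional ^^<datatype> or @lang
TRIPLE_RE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|_:\S+|".*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)\s*\.\s*$'
)


def parse_ntriples_line(line):
    """Return (subject, predicate, object), or None for blank, comment, or unparsed lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = TRIPLE_RE.match(stripped)
    if match is None:
        return None
    return match.groups()
```

Because each statement sits on its own line, input splits can be processed independently, which is what makes this syntax a natural candidate for a line-based loader.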

This JIRA leaves open the strategy for loading RDF data into Giraph. There are various possibilities,
including exploitation of intermediate Hadoop-friendly stores, or pre-processing with e.g.
Pig-based tools into a more Giraph-friendly form, or writing custom loaders. Even a HOWTO
document or implementor notes here would be an advance on the current state of the art. The
BluePrints Graph API (Gremlin etc.) has also been aligned with various RDF datasources.
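
As one illustration of the pre-processing option, the sketch below turns (subject, predicate, object) triples into one adjacency-list line per vertex. Everything about the output layout is an assumption made for this example: the JSON shape [vertex_id, [[target_id, predicate], ...]], the dense integer IDs, and the decision to skip literal objects are all hypothetical choices, not an existing Giraph input format. The function name triples_to_adjacency_lines is likewise made up here.

```python
import json
from collections import defaultdict


def triples_to_adjacency_lines(triples):
    """Group triples by subject into JSON adjacency-list lines.

    Returns (lines, ids), where ids maps each RDF term to the integer
    vertex ID assigned to it (hypothetical scheme: first-seen order).
    """
    ids = {}

    def vertex_id(term):
        # Assign the next free integer the first time a term is seen.
        return ids.setdefault(term, len(ids))

    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        if obj.startswith('"'):
            # Literals are attribute values rather than vertices; skipped in this sketch.
            continue
        src = vertex_id(subject)
        dst = vertex_id(obj)
        adjacency[src].append([dst, predicate])
        adjacency[dst]  # touching the key ensures targets also get a (possibly empty) line

    lines = [json.dumps([vid, adjacency[vid]]) for vid in sorted(adjacency)]
    return lines, ids
```

A custom loader on the Giraph side would then only need to read one self-contained line per vertex, and the ids mapping can be kept to translate results back to URIs.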

Related topics: multigraph support touches on this issue, since we can't currently easily
represent fully general RDF graphs: two nodes might be connected by more than one typed edge.
Even without multigraphs it ought to be possible to bring RDF-sourced data into Giraph, e.g.
perhaps some app is only interested in, say, the Movies + People subset of a big RDF collection.
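
One possible way to work around the missing multigraph support, sketched here under stated assumptions, is to keep only the predicates an application cares about and to merge parallel edges between the same pair of nodes into a single edge that carries the list of predicate labels. The function name simple_graph_edges and the choice to store labels as a sorted list are hypothetical; other encodings (for example, one graph per predicate) would work equally well.

```python
def simple_graph_edges(triples, wanted_predicates=None):
    """Collapse typed RDF edges into at most one edge per (source, target) pair.

    triples           -- iterable of (subject, predicate, object) tuples
    wanted_predicates -- optional set of predicates to keep; None keeps all
    Returns a dict mapping (subject, object) to a sorted list of predicates.
    """
    edges = {}
    for subject, predicate, obj in triples:
        if wanted_predicates is not None and predicate not in wanted_predicates:
            continue
        # Parallel edges between the same two nodes are merged into one label set.
        edges.setdefault((subject, obj), set()).add(predicate)
    return {pair: sorted(labels) for pair, labels in edges.items()}
```

For a Movies + People subset, wanted_predicates might hold just the director and actor properties of the dataset in question, so that everything else is dropped before the data reaches Giraph.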

From Avery in email: "a helper VertexInputFormat (and maybe VertexOutputFormat) would
certainly [despite GIRAPH-141] still help"
