giraph-dev mailing list archives

From "Arun Suresh (Commented) (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-96) Support for Graphs with Huge adjacency lists
Date Thu, 17 Nov 2011 15:00:58 GMT


Arun Suresh commented on GIRAPH-96:

Looks like Claudio beat me to a similar suggestion in GIRAPH-94.

My proposal was more for a standard means of storing vertex/adjacency list information. The
Giraph framework would handle the storage and would expose APIs which the Vertex reader can
use to store the information as it reads the graph. The user would then not be required to
subclass a Vertex class and implement the initialize() method. All adjacency list/vertex manipulation
would go through the common data store.
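
A minimal sketch of what such a framework-owned store might look like, to make the idea concrete. Every name here (AdjacencyStore, InMemoryAdjacencyStore, loadEdge, the "source target weight" line format) is hypothetical and is not an existing Giraph API; the in-memory map only stands in for whatever backend the framework would actually use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Giraph.
class AdjacencyStoreSketch {

    /** Assumed framework-owned store for vertex values and out-edges. */
    interface AdjacencyStore<I, V, E> {
        void putVertex(I vertexId, V value);
        void addEdge(I sourceId, I targetId, E edgeValue);
        V getVertexValue(I vertexId);
        Iterable<Map.Entry<I, E>> getOutEdges(I sourceId);
    }

    /** In-memory stand-in; a real backend could be a distributed store. */
    static class InMemoryAdjacencyStore<I, V, E> implements AdjacencyStore<I, V, E> {
        private final Map<I, V> values = new HashMap<>();
        private final Map<I, Map<I, E>> edges = new HashMap<>();

        @Override
        public void putVertex(I vertexId, V value) {
            values.put(vertexId, value);
        }

        @Override
        public void addEdge(I sourceId, I targetId, E edgeValue) {
            edges.computeIfAbsent(sourceId, k -> new LinkedHashMap<>())
                 .put(targetId, edgeValue);
        }

        @Override
        public V getVertexValue(I vertexId) {
            return values.get(vertexId);
        }

        @Override
        public Iterable<Map.Entry<I, E>> getOutEdges(I sourceId) {
            // A vertex with no recorded edges yields an empty iteration.
            Map<I, E> out = edges.getOrDefault(sourceId, Collections.emptyMap());
            return out.entrySet();
        }
    }

    /**
     * What a vertex reader might do for one input line of the assumed form
     * "source target weight": it writes straight to the store instead of
     * handing a complete adjacency map to Vertex.initialize().
     */
    static void loadEdge(String line, AdjacencyStore<Long, Double, Float> store) {
        String[] parts = line.trim().split("\\s+");
        long source = Long.parseLong(parts[0]);
        long target = Long.parseLong(parts[1]);
        float weight = Float.parseFloat(parts[2]);
        store.addEdge(source, target, weight);
    }
}
```

With an interface like this, the reader would call loadEdge (or its equivalent) once per edge as it streams the input, and vertex code would iterate getOutEdges rather than keeping its own copy of the adjacency list.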
> Support for Graphs with Huge adjacency lists
> --------------------------------------------
>                 Key: GIRAPH-96
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Arun Suresh
> Currently the vertex initialize() method is passed the complete adjacency list as a HashMap.
> All the current concrete implementations of Vertex iterate over the adjacency list and recreate
> new data structures within the Vertex instance to hold/manipulate the adjacency list. This
> would cease to be feasible once the size of the adjacency list becomes really huge.
> I propose storing the adjacency list and all vertex information (and incoming messages?)
> in a distributed data store such as HBase. The adjacency list can be lazily loaded via
> HBase Scans. I was thinking of an HBase schema where the row ID is a concatenation of
> VertexID+OutboundVertexId, with a single column containing the edge.
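
The row-key idea in the quoted description can be sketched without HBase itself. The example below is only an illustration under stated assumptions: a sorted TreeMap stands in for the HBase table, the IDs are assumed to be non-negative longs, and the fixed-width hex encoding is an added assumption (a plain string concatenation would make vertex 1 + target 23 indistinguishable from vertex 12 + target 3). The names EdgeRowKeySketch, rowKey, vertexPrefix and scanEdges are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch; a TreeMap stands in for a sorted HBase table.
class EdgeRowKeySketch {

    /** Row key = fixed-width VertexID followed by fixed-width OutboundVertexId. */
    static String rowKey(long vertexId, long targetId) {
        return String.format("%016x%016x", vertexId, targetId);
    }

    /** Common prefix shared by every edge row of one vertex. */
    static String vertexPrefix(long vertexId) {
        return String.format("%016x", vertexId);
    }

    /**
     * Walks one vertex's edge rows in key order, starting at the vertex prefix
     * and stopping at the first row that belongs to another vertex. The rows
     * are collected into a list here for brevity; a real scan would stream them.
     */
    static List<String> scanEdges(NavigableMap<String, String> table, long vertexId) {
        String prefix = vertexPrefix(vertexId);
        List<String> edges = new ArrayList<>();
        for (Map.Entry<String, String> row : table.tailMap(prefix, true).entrySet()) {
            if (!row.getKey().startsWith(prefix)) {
                break; // left this vertex's contiguous key range
            }
            edges.add(row.getValue());
        }
        return edges;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put(rowKey(1, 23), "1->23");
        table.put(rowKey(12, 3), "12->3");
        table.put(rowKey(1, 4), "1->4");
        System.out.println(scanEdges(table, 1)); // prints [1->4, 1->23]
    }
}
```

Because the table is sorted by row key, all edges of a vertex sit next to each other, so reading an adjacency list touches only that vertex's key range and never needs the whole list in memory at once.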


