giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sebastian Schelter <>
Subject Out-of-core messaging
Date Thu, 03 May 2012 13:44:06 GMT

I'd like to ask whether someone is currently working on out-of-core
messaging for Giraph (e.g. by spilling messages to disk in case of
memory pressure).

I ran some experiments with Giraph on a small 6-machine cluster and got
really nice results for smaller datasets such as the wikipedia pagelink
graph (6M vertices, ~250M edges in its undirected version).

For larger graphs with a even more skewed degree distribution such as
the twitter follower graph from [1], Giraph crashes in the first
superstep unfortunately. My colleagues observed the same, when they ran
benchmarks of Giraph against the Stratosphere system [2], where Giraph
did kind of well for small datasets, but again crashed for larger ones...

I think the lack of out-of-core messages is currently the biggest
obstacle to recommending people to test Giraph in production use.



View raw message