giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-314) Implement better message grouping to improve performance in SimpleTriangleClosingVertex
Date Wed, 05 Sep 2012 18:11:09 GMT


Eli Reisman commented on GIRAPH-314:

Yeah, the idea is not so much to gain any performance as to let us get past degree(V)^2 messages
being sent without crashing. Neither the disk-backed nor the in-memory solutions are working
so far, since the messages pile up so quickly at scale. So the "performance gain" is just
surviving a large run at all.

This is sort of in prep for stage 2, where we assume we know some things about messages sent
through sendMessageToAllEdges() calls (namely that there will be a lot of unneeded duplication
as things stand now) and handle those differently through the whole pipeline. Even then, to
run this at the scale I'm trying to, the amortization option has to be there too, so this
is just getting a scalable example up and running for testing purposes.

I can fix the hashCode and check the javadoc, thanks again. We're amortizing the cost of all
those messages over time. So I guess it's more of a trade-off than an amortization. But
then the un-amortized cost is crashing the job, so maybe it is...?

> Implement better message grouping to improve performance in SimpleTriangleClosingVertex
> ---------------------------------------------------------------------------------------
>                 Key: GIRAPH-314
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>            Priority: Trivial
>             Fix For: 0.2.0
>         Attachments: GIRAPH-314-1.patch, GIRAPH-314-2.patch
> After running SimpleTriangleClosingVertex at scale, I'm thinking sendMessageToAllEdges()
> is pretty in the code, but it's not a good idea in practice, since each vertex V sends
> degree(V)^2 messages right in the first superstep of this algorithm. Could do something
> with a combiner etc., but just grouping messages by hand at the application level, using
> IntArrayListWritable again, does the trick fine.
> Probably should have just done it this way before, but sendMessageToAllEdges() looked
> so nice. Sigh. Changed unit tests to reflect this new approach, passes mvn verify and cluster,
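
To make the degree(V)^2 point concrete, here is a rough sketch of the message counts for a single
vertex. It is plain Java, not Giraph code and not the attached patch: the Message class and both
send methods are hypothetical stand-ins, and it assumes the ungrouped version sends each neighbor
id to every neighbor as a separate message, while the grouped version sends each neighbor one
message carrying the whole id list (roughly the role IntArrayListWritable plays above).

```java
import java.util.ArrayList;
import java.util.List;

public class MessageGroupingSketch {

    /** Hypothetical stand-in for a message in flight: a target vertex id plus a payload of ids. */
    public static final class Message {
        public final int target;
        public final int[] payload;

        public Message(int target, int[] payload) {
            this.target = target;
            this.payload = payload;
        }
    }

    /** Ungrouped: one message per (neighbor id, target) pair, so degree^2 messages. */
    public static List<Message> sendUngrouped(int[] neighbors) {
        List<Message> out = new ArrayList<>();
        for (int id : neighbors) {
            for (int target : neighbors) {
                out.add(new Message(target, new int[] {id}));
            }
        }
        return out;
    }

    /** Grouped: one message per target carrying the whole neighbor list, so degree messages. */
    public static List<Message> sendGrouped(int[] neighbors) {
        List<Message> out = new ArrayList<>();
        for (int target : neighbors) {
            out.add(new Message(target, neighbors.clone()));
        }
        return out;
    }

    public static void main(String[] args) {
        int[] neighbors = {2, 3, 4, 5};
        System.out.println("ungrouped: " + sendUngrouped(neighbors).size());
        System.out.println("grouped:   " + sendGrouped(neighbors).size());
    }
}
```

With four neighbors this gives 16 ungrouped messages versus 4 grouped ones. The number of ids
shipped is the same either way in this sketch; what drops is the number of separate messages.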
