giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-314) Implement better message grouping to improve performance in SimpleTriangleClosingVertex
Date Fri, 07 Sep 2012 18:45:08 GMT


Eli Reisman commented on GIRAPH-314:

I'd love for us to move this to GIRAPH-322 and get your input now that the code is up and
you can see what the idea was. I am instrumenting it now so I can see where I messed up the
wiring, but the basic idea is there.

I didn't implement the combiner option yet in the solution I put up. I would be interested
in trying some more as I am sure you're right that with better tuning (or more expert tuning)
disk spill should be a part of a final solution. I was surprised it didn't work too, it looks
like it should handle exactly this situation. And again, with better tuning perhaps it will.

But I was running real data, and a lot of it. Everyone here has noted the benchmarks are great
for A/B'ing Giraph as it improves and measuring progress in a sane way, but not great for
comparing with conditions out in the wild. I'm hoping GIRAPH-26 will help close this gap,
but for us the benchmarks have been poor predictors of real performance in our target use-cases.

As for my solution so far, the idea is to reduce the number of partitions to one per worker with
-Dhash.userPartitionCount and then store messages so that only a single object is in memory
at any given time (with one reference per partition destination), and they simply accumulate
destination vertices and flush regularly. The only "real" messages that go out are one per partition
that requires a copy of that message (depending on which vertices need it), which will differ
per message. Again, I tried to make the patch simple and changeable so we can tune this or
improve the idea and try things to see what works best.
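
The idea in the paragraph above can be sketched roughly as follows. This is only an
illustration and not the code in the patch: the class BroadcastBuffer, its methods, the
String payload, the flush threshold, and the modulo-based partition assignment are all
hypothetical stand-ins, and nothing here uses Giraph's actual classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Giraph code): one shared payload, destinations grouped per partition. */
class BroadcastBuffer {
    /** One "real" outgoing message: a partition, the shared payload, and the vertices that need it. */
    static final class Outgoing {
        final int partition;
        final String payload;
        final List<Integer> destinations;

        Outgoing(int partition, String payload, List<Integer> destinations) {
            this.partition = partition;
            this.payload = payload;
            this.destinations = destinations;
        }
    }

    private final String payload;          // the single message object kept in memory
    private final int partitionCount;      // assumed to be one partition per worker
    private final int flushThreshold;      // assumed tuning knob: flush after this many destinations
    private final Map<Integer, List<Integer>> pending = new HashMap<>();
    private final List<Outgoing> sent = new ArrayList<>();
    private int pendingCount = 0;

    BroadcastBuffer(String payload, int partitionCount, int flushThreshold) {
        this.payload = payload;
        this.partitionCount = partitionCount;
        this.flushThreshold = flushThreshold;
    }

    /** Record that vertexId needs the payload; no copy of the payload is made. */
    void addDestination(int vertexId) {
        // Assumption: vertices map to partitions by a simple modulo hash.
        int partition = Math.floorMod(vertexId, partitionCount);
        pending.computeIfAbsent(partition, k -> new ArrayList<>()).add(vertexId);
        pendingCount++;
        if (pendingCount >= flushThreshold) {
            flush();
        }
    }

    /** Emit one outgoing message per partition that has pending destinations. */
    void flush() {
        for (Map.Entry<Integer, List<Integer>> entry : pending.entrySet()) {
            sent.add(new Outgoing(entry.getKey(), payload, entry.getValue()));
        }
        pending.clear();
        pendingCount = 0;
    }

    /** Messages emitted so far; a stand-in for the network send. */
    List<Outgoing> sent() {
        return sent;
    }
}
```

With two partitions and six destination vertices, a single flush in this sketch emits two
outgoing messages that both reference the same payload object, rather than six copies.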

The problem I had so far with combiners is they just aggregate messages for one vertex rather
than destinations for one message. In the solution so far I found it easier to set this up
by hand, since we are in a special case: we know something useful about the properties of
a message sent with sendMessageToAllEdges() and can avoid some of the object creations and
checks along the way. As you said, the place for a combiner in this scenario, if anywhere,
seems to be on the receiving end.
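
To make the distinction concrete, here is a small sketch of the two grouping shapes. The
class and method names are hypothetical and the String message is a stand-in; this is not
Giraph's combiner API, just the two index layouts side by side.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch (not Giraph code) of the two ways to group a broadcast message. */
class GroupingShapes {
    /** Combiner-style layout: destination vertex -> messages addressed to that vertex. */
    static Map<Integer, List<String>> byVertex(String message, int[] destinations) {
        Map<Integer, List<String>> index = new TreeMap<>();
        for (int vertex : destinations) {
            index.computeIfAbsent(vertex, k -> new ArrayList<>()).add(message);
        }
        return index;
    }

    /** Inverted layout: message -> destination vertices that need it. */
    static Map<String, List<Integer>> byMessage(String message, int[] destinations) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int vertex : destinations) {
            index.computeIfAbsent(message, k -> new ArrayList<>()).add(vertex);
        }
        return index;
    }
}
```

For one message broadcast to four vertices, the vertex-keyed layout holds four entries with
one message each, so a per-vertex combiner has nothing to merge. The message-keyed layout
holds a single entry with four destinations.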

Now that the "game plan" patch is up, I'll be very interested in ideas and observations. If
this gets any traction, we could then set up a disk spill strategy for these types of messages
that does not re-duplicate them on load (since the message stores are all currently set up
for Giraph's existing Partition -> Vertex -> List<M> paradigm). Alternately, the
whole exercise might be a waste of time ;) but I have it on good authority this is a route
worth pursuing, so we'll see where it leads.

> Implement better message grouping to improve performance in SimpleTriangleClosingVertex
> ---------------------------------------------------------------------------------------
>                 Key: GIRAPH-314
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>            Priority: Trivial
>             Fix For: 0.2.0
>         Attachments: GIRAPH-314-1.patch, GIRAPH-314-2.patch, GIRAPH-314-3.patch, GIRAPH-314-4.patch
> After running SimpleTriangleClosingVertex at scale I'm thinking the sendMessageToAllEdges()
> is pretty in the code, but it's not a good idea in practice since each vertex V sends degree(V)^2
> messages right in the first superstep in this algorithm. Could do something with a combiner
> etc. but just grouping messages by hand at the application level by using IntArrayListWritable
> again does the trick fine.
> Probably should have just done it this way before, but sendMessageToAllEdges() looked
> so nice. Sigh. Changed unit tests to reflect this new approach, passes mvn verify and cluster,
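
A rough illustration of the message counts described in the issue text above. It assumes
the original superstep calls sendMessageToAllEdges() once per neighbor id, and that the
grouped version sends each neighbor one list of all neighbor ids. The class and method
names are hypothetical, and a plain int[] stands in for IntArrayListWritable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Giraph code) comparing message fan-out for one vertex. */
class TriangleClosingFanOut {
    /** One broadcast per neighbor id: each entry is {destination, announcedNeighbor}. */
    static List<int[]> perNeighborBroadcast(int[] neighbors) {
        List<int[]> messages = new ArrayList<>();
        for (int announced : neighbors) {
            for (int destination : neighbors) {
                messages.add(new int[] {destination, announced});
            }
        }
        return messages; // degree * degree entries
    }

    /** One grouped message per neighbor, carrying the whole neighbor list. */
    static Map<Integer, int[]> groupedBroadcast(int[] neighbors) {
        Map<Integer, int[]> messages = new LinkedHashMap<>();
        for (int destination : neighbors) {
            messages.put(destination, neighbors.clone());
        }
        return messages; // one entry per distinct neighbor
    }
}
```

For a vertex with four distinct neighbors, the first method in this sketch produces 16
messages and the second produces 4, each carrying four ids.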

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
