giraph-dev mailing list archives

From "Claudio Martella (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-45) Improve the way to keep outgoing messages
Date Tue, 15 Nov 2011 20:38:52 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150753#comment-13150753 ]

Claudio Martella commented on GIRAPH-45:
----------------------------------------

I'm not sure that sending messages in a streaming fashion would actually reduce memory
pressure. Since messages cannot be consumed until the current superstep has finished, I guess
this would just move the pressure to the nodes the messages are transferred to. In some
scenarios it can even increase the "cumulative memory" consumed (the total memory used
across the nodes in the cluster).

Suppose vertex A sends a message to vertices B, C, D and E. B and C are on the same node as
A, D is on a second node and E is on a third node. B and C can then share the message sent
by A, as they live in the same JVM (leaving aside semantics where the message must be cloned
before it is delivered). In this scenario we would have one copy per node, i.e. #nodes copies
of the same message across the cluster. Topology-based graph partitioning would let most of
these messages go to vertices living in the same JVM (assuming the communication pattern of
the vertices follows the graph topology) and would alleviate this problem.
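
To make the counting above concrete, here is a minimal sketch in Python. It is not Giraph code: the function name `count_message_copies`, the `placement` mapping and the node labels are all hypothetical, and it simply assumes that destinations in the same JVM can share one copy of a message.

```python
def count_message_copies(placement, destinations, share_within_node=True):
    """Count how many copies of one message exist across the cluster.

    placement    -- hypothetical mapping of vertex id to the node hosting it
    destinations -- vertices the message is sent to
    """
    if share_within_node:
        # Assumption: destinations in the same JVM share a single copy,
        # so we only count the distinct nodes that receive the message.
        return len({placement[v] for v in destinations})
    # If every destination needs its own clone, count one copy per vertex.
    return len(destinations)


# The scenario from the comment: B and C live with A, D and E live elsewhere.
placement = {"A": "node1", "B": "node1", "C": "node1", "D": "node2", "E": "node3"}
destinations = ["B", "C", "D", "E"]

shared = count_message_copies(placement, destinations)
cloned = count_message_copies(placement, destinations, share_within_node=False)
print(shared, cloned)
```

With a partitioning that puts all four destinations on A's node, the shared count would drop to one, which is the effect topology-based partitioning is hoped to have.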

It feels like keeping messages out-of-core is the best option we have right now, and if we
manage to store the messages in the same order in which their destination vertices are
processed, we could even get a scan-based computation with quite good throughput. Does that
make sense?
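
A rough sketch of the scan-based idea, in Python rather than in Giraph's own code. Everything here is an assumption made for illustration: the helper names `spill_sorted`, `scan_messages` and `deliver_in_order`, the tab-separated file layout, and the in-memory sort (a real spill would more likely sort bounded chunks and merge them).

```python
import itertools
import os
import tempfile


def spill_sorted(messages, vertex_order, path):
    """Write (destination, payload) pairs to disk in vertex processing order."""
    rank = {v: i for i, v in enumerate(vertex_order)}
    with open(path, "w", encoding="utf-8") as f:
        # sorted() is stable, so payloads for one vertex keep their send order.
        for dest, payload in sorted(messages, key=lambda m: rank[m[0]]):
            f.write(f"{dest}\t{payload}\n")


def scan_messages(path):
    """Yield (destination, [payloads]) using one sequential pass over the file."""
    with open(path, encoding="utf-8") as f:
        rows = (line.rstrip("\n").split("\t", 1) for line in f)
        for dest, group in itertools.groupby(rows, key=lambda r: r[0]):
            yield dest, [payload for _, payload in group]


def deliver_in_order(vertex_order, path):
    """Walk vertices and the spilled file in lockstep, never seeking backwards."""
    inbox = scan_messages(path)
    pending = next(inbox, None)
    for v in vertex_order:
        if pending is not None and pending[0] == v:
            yield v, pending[1]
            pending = next(inbox, None)
        else:
            yield v, []  # this vertex received no messages


order = ["B", "C", "D"]
path = os.path.join(tempfile.mkdtemp(), "messages.tsv")
spill_sorted([("D", "m1"), ("B", "m2"), ("D", "m3")], order, path)
result = list(deliver_in_order(order, path))
print(result)
```

Because the file is already in processing order, each vertex's messages are read with a single forward scan, and only one vertex's messages need to be in memory at a time.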
                
> Improve the way to keep outgoing messages
> -----------------------------------------
>
>                 Key: GIRAPH-45
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-45
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>
> As discussed in GIRAPH-12 (http://goo.gl/CE32U), I think that there is a potential problem
> to cause out of memory when the rate of message generation is higher than the rate of message
> flush (or network bandwidth).
> To overcome this problem, we need more eager strategy for message flushing or some approach
> to spill messages into disk.
> The below link is Dmitriy's suggestion.
> https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253


        
