giraph-dev mailing list archives

From "Hassan Eslami (JIRA)" <>
Subject [jira] [Created] (GIRAPH-1066) Functional adaptive out-of-core mechanism
Date Thu, 19 May 2016 02:52:12 GMT
Hassan Eslami created GIRAPH-1066:

             Summary: Functional adaptive out-of-core mechanism
                 Key: GIRAPH-1066
             Project: Giraph
          Issue Type: New Feature
          Components: bsp, graph
            Reporter: Hassan Eslami
            Assignee: Hassan Eslami

In this JIRA we propose the following contributions to the out-of-core mechanism:
• A simpler API is provided for trying out various out-of-core policies on top of the basic
infrastructure proposed in GIRAPH-1048. The new API lets developers of out-of-core policies focus
only on the out-of-core logic rather than on the complications of multi-threading, disk
interactions, etc. The policy logic is abstracted out as far as possible so that other
out-of-core policies are simple to develop and try.
• Two adaptive out-of-core policies are implemented using the proposed API. One is based
on a few recent GC behaviors, and the other is based on user-defined thresholds that control
the memory pressure. With the adaptive out-of-core policies, the job automatically uses secondary
storage devices when the data cannot fit into memory. Also, if at some point in the computation
the memory pressure goes down, the data spilled to secondary storage is automatically
loaded back into memory.
• The out-of-core infrastructure is integrated with the message flow control proposed in GIRAPH-1027.
Using credit-based flow control, an out-of-core policy can predict the amount of memory that
messages will use in the near future, so the policy has fine-grained control over messages and
their memory footprint.
• A new feature, called data generation tethering, is also added. This feature lets the out-of-core
policy decide how many threads (input/compute) should be active at each moment, indirectly
controlling the rate of data generation and, in turn, the memory footprint of
graph data.
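
To make the points above concrete, here is a minimal sketch of what a threshold-based policy
could look like. It is not the API proposed in this JIRA: every name in it (Policy, Action,
ThresholdPolicy, nextAction, activeThreads) is hypothetical, and memory is modeled as plain
fractions of the heap passed in by the caller. The assumption is that the infrastructure handles
threads and disk I/O, while the policy only answers two questions: what to do next, and how many
input/compute threads to keep active.

```java
/** Hypothetical sketch only; none of these names come from Giraph. */
final class OutOfCorePolicySketch {

  /** What the policy asks the (assumed) infrastructure to do next. */
  enum Action { SPILL_TO_DISK, LOAD_FROM_DISK, NO_OP }

  /** The only piece a policy author would implement in this sketch. */
  interface Policy {
    /** Both arguments are fractions of the heap in [0, 1]. */
    Action nextAction(double usedFraction, double pendingMessageFraction);

    /** Number of input/compute threads allowed to run right now. */
    int activeThreads(double usedFraction, int maxThreads);
  }

  /** Policy driven by two user-defined thresholds (assumed to be heap fractions). */
  static final class ThresholdPolicy implements Policy {
    private final double low;
    private final double high;

    ThresholdPolicy(double low, double high) {
      if (!(0.0 <= low && low < high && high <= 1.0)) {
        throw new IllegalArgumentException("need 0 <= low < high <= 1");
      }
      this.low = low;
      this.high = high;
    }

    @Override
    public Action nextAction(double usedFraction, double pendingMessageFraction) {
      // With credit-based flow control the in-flight messages are bounded, so
      // the policy can add the memory they are expected to occupy soon.
      double projected = usedFraction + pendingMessageFraction;
      if (projected > high) {
        return Action.SPILL_TO_DISK;   // memory pressure too high
      }
      if (projected < low) {
        return Action.LOAD_FROM_DISK;  // pressure dropped, bring data back
      }
      return Action.NO_OP;
    }

    @Override
    public int activeThreads(double usedFraction, int maxThreads) {
      // Slow data generation down as usage moves from the low to the high threshold.
      if (usedFraction <= low) {
        return maxThreads;
      }
      if (usedFraction >= high) {
        return 1;                      // keep one thread so the job still progresses
      }
      double headroom = (high - usedFraction) / (high - low);
      return Math.max(1, (int) Math.round(headroom * maxThreads));
    }
  }

  public static void main(String[] args) {
    Policy policy = new ThresholdPolicy(0.5, 0.8);
    System.out.println(policy.nextAction(0.75, 0.10));
    System.out.println(policy.nextAction(0.30, 0.05));
    System.out.println(policy.activeThreads(0.90, 8));
  }
}
```

In this sketch a GC-based policy would be just another Policy implementation that derives its
decision from recent collection statistics instead of fixed thresholds; the infrastructure
calling nextAction and activeThreads would not need to change.
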
With this JIRA landed, we will have a fully functional out-of-core infrastructure that prevents
any reasonable job from failing due to OOM.

