giraph-dev mailing list archives

From: Maja Kabiljo
Subject: Review Request: GIRAPH-498: We should check input splits status from ZooKeeper once per worker, not once per split thread
Date: Tue, 05 Feb 2013 05:42:07 GMT

Review request for giraph.


With many workers and many input split threads, verifying that all input splits are
finished once reading is done takes a long time, because every input split thread checks
every input split. This patch makes each worker perform that check once instead.
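
To make the cost difference concrete, here is a minimal sketch of the two checking strategies. It is not the code from this patch: `SplitCheckSketch`, `FakeSplitStore`, and both method names are hypothetical, the store is an in-memory stand-in that only counts reads (no real ZooKeeper calls), and the figure of 1000 splits is an assumption chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SplitCheckSketch {
    /** Hypothetical stand-in for ZooKeeper: counts "is this split finished?" reads. */
    static final class FakeSplitStore {
        private final int numSplits;
        private final AtomicLong reads = new AtomicLong();

        FakeSplitStore(int numSplits) {
            this.numSplits = numSplits;
        }

        /** Simulates one remote read per input split. */
        boolean allSplitsFinished() {
            for (int i = 0; i < numSplits; i++) {
                reads.incrementAndGet();
            }
            return true;
        }

        long reads() {
            return reads.get();
        }
    }

    /** Runs the same task on every input thread and waits for all of them. */
    private static void runOnThreads(int threads, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(task);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Behaviour described for trunk: each thread checks every split itself. */
    public static long checkPerThread(int threads, int splits) {
        FakeSplitStore store = new FakeSplitStore(splits);
        runOnThreads(threads, () -> store.allSplitsFinished());
        return store.reads();
    }

    /** Proposed behaviour: threads only read; the worker checks once at the end. */
    public static long checkPerWorker(int threads, int splits) {
        FakeSplitStore store = new FakeSplitStore(splits);
        runOnThreads(threads, () -> { /* read assigned splits, no final check */ });
        store.allSplitsFinished(); // single check after all threads are done
        return store.reads();
    }

    public static void main(String[] args) {
        // 20 input threads as in the test run below; 1000 splits is an assumption.
        System.out.println("reads, once per thread: " + checkPerThread(20, 1000));
        System.out.println("reads, once per worker: " + checkPerWorker(20, 1000));
    }
}
```

In this simplified model the number of status reads per worker drops by a factor equal to the number of input threads. The real saving depends on how ZooKeeper behaves under load, which the sketch does not attempt to model.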

This addresses bug GIRAPH-498.


  giraph-core/src/main/java/org/apache/giraph/conf/ 796047d
  giraph-core/src/main/java/org/apache/giraph/worker/ f542344
  giraph-core/src/main/java/org/apache/giraph/worker/ 7d40dfb
  giraph-core/src/main/java/org/apache/giraph/worker/ 1adcd73
  giraph-core/src/main/java/org/apache/giraph/worker/ bfaefd2
  giraph-core/src/main/java/org/apache/giraph/worker/ d09ca2b
  giraph-core/src/main/java/org/apache/giraph/worker/ PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/worker/ a4f98e1
  giraph-core/src/test/java/org/apache/giraph/ 987f51c
  giraph-hbase/.graph.csv.crc PRE-CREATION
  giraph-hbase/graph.csv PRE-CREATION



mvn clean verify

Real application, using 200 workers and 20 input threads:
- trunk - about 560s for input split threads to finish, 720s for input superstep
- with this patch - about 310s for input split threads to finish, 500s for input superstep


Maja Kabiljo
