hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Update of "Hamburg" by udanax
Date Mon, 20 Jul 2009 08:49:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The following page has been changed by udanax:

+ = Rationale =
  == Motivation ==
  The MapReduce (M/R) programming model is ill-suited to problems based on data where each
portion depends on many other portions and their relations are very complicated. This is because
such problems cause the following:
   * limit to assigning one reducer
@@ -10, +11 @@

  These problems are very common in many areas; many graph problems are typical examples.

- TODO - write description of an example.
  Therefore, we propose a new programming model, named Hamburg. The main objective
of Hamburg is to support problems based on data whose portions have complex dependencies on one
another. This page is an initial draft of our proposal.
  == Goal ==
   * Follow the scalability concept of the shared-nothing architecture
   * Support a simple programming model to compute complex relations, such as graph data.
- == Hamburg ==
+ = Hamburg =
  Hamburg is an alternative to the M/R programming model. It is based on the bulk synchronous
parallel (BSP) model. Like M/R, Hamburg takes advantage of the shared-nothing (SN)
architecture, so I expect that it will also scale with almost no degradation of performance
as the number of participating nodes increases.
  A Hamburg computation step, based on BSP, consists of three substeps:
   * Computation on data that resides in local storage; it is similar to the map operation in M/R.
@@ -27, +26 @@

  The main difference between Hamburg and M/R is that Hamburg does not aggregate intermediate
data into reducers. Instead, each computation node sends only the necessary data to the
other nodes.
  This will be efficient if the total communicated data is smaller than the intermediate data
that would be aggregated into reducers.
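
As a rough illustration of the step structure described above, here is a minimal single-process sketch in Python. It is not Hamburg code: the function name `bsp_min_label`, the `partitions` layout, and the choice of problem (labelling each vertex with the smallest vertex id in its connected component) are all assumptions made for illustration. It follows the standard BSP pattern of local computation, communication, and barrier synchronization, and sends a message only when a vertex's label has changed.

```python
from collections import defaultdict

def bsp_min_label(partitions, edges, max_supersteps=50):
    """Label every vertex with the smallest vertex id in its connected component.

    partitions: list of vertex-id lists, one per simulated worker (hypothetical layout).
    edges: dict mapping a vertex id to its neighbour ids (undirected, both directions listed).
    """
    owner = {v: w for w, part in enumerate(partitions) for v in part}
    label = {v: v for v in owner}

    # Initial messages: every vertex announces its own label to its neighbours.
    inbox = defaultdict(list)
    for v in owner:
        for n in edges.get(v, []):
            inbox[n].append(label[v])

    for _ in range(max_supersteps):
        outbox = [defaultdict(list) for _ in partitions]  # one outbox per worker
        # 1. Local computation: each worker touches only the vertices it owns.
        for w, part in enumerate(partitions):
            for v in part:
                best = min(inbox.get(v, []), default=label[v])
                if best < label[v]:
                    label[v] = best
                    # 2. Communication: only changed labels are sent on.
                    for n in edges.get(v, []):
                        outbox[w][n].append(best)
        # 3. Barrier: messages become visible only in the next superstep.
        inbox = defaultdict(list)
        for box in outbox:
            for target, msgs in box.items():
                inbox[target].extend(msgs)
        if not inbox:  # no worker sent anything, so the computation has converged
            break
    return label
```

For example, with two simulated workers holding `[[1, 2], [3, 4, 5]]` and edges 1-2, 2-3, and 4-5, vertices 1 to 3 end up with label 1 and vertices 4 and 5 with label 4. No intermediate data is gathered at a single reducer; each step exchanges only the labels that changed.
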
+ Let's look in more detail at the diagram of Hamburg's computing method, which is based on the BSP model.
+ [http://lh4.ggpht.com/_DBxyBGtfa3g/SmQUYTHWooI/AAAAAAAABmk/cFVlLCdLVHE/s800/figure1.PNG]
+ Each worker processes the split that is stored locally on its machine. Then we can do
bulk synchronization using the collected communication data. The 'Computation' and 'Bulk synchronization'
steps can be performed iteratively. Data for synchronization can be compressed to reduce network
- === Initial contributors ===
+ == Initial contributors ==
   * Edward J. (edwardyoon AT apache.org)
   * Hyunsik Choi (hyunsik.choi AT gmail.com)
  Any volunteers are welcome.
+ == Implementation ==
