giraph-dev mailing list archives

From "Claudio Martella" <claudio.marte...@gmail.com>
Subject Re: Review Request 13492: LinkRank implementation with Giraph
Date Thu, 29 Aug 2013 15:58:25 GMT


> On Aug. 29, 2013, 2:59 p.m., Ahmet Emre Aladag wrote:
> > giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexWorkerContext.java, line 28
> > <https://reviews.apache.org/r/13492/diff/4/?file=346224#file346224line28>
> >
> >     I couldn't understand this. Can Master do postSuperstep operations?

The master runs before the workers in each superstep, so whatever the workers would do in
their postSuperstep() can be done in the master instead, for example.
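
To make the ordering above concrete, here is a minimal sketch in plain Java. It is a toy
model, not Giraph code: the class name, the run() helper and the event strings are all made
up for illustration, and it only assumes what is stated above, namely that the master step
of a superstep runs before the workers' hooks for that superstep.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of superstep ordering. None of these names are Giraph classes;
// they are hypothetical and exist only to illustrate the ordering.
class SuperstepOrderSketch {

  /** Returns the sequence of hook invocations for the given number of supersteps. */
  static List<String> run(int supersteps, int workers) {
    List<String> events = new ArrayList<>();
    for (int s = 0; s < supersteps; s++) {
      // Assumption taken from the thread: the master runs before the workers.
      events.add("master.compute(" + s + ")");
      for (int w = 0; w < workers; w++) {
        events.add("worker" + w + ".preSuperstep(" + s + ")");
      }
      for (int w = 0; w < workers; w++) {
        events.add("worker" + w + ".compute(" + s + ")");
      }
      for (int w = 0; w < workers; w++) {
        events.add("worker" + w + ".postSuperstep(" + s + ")");
      }
    }
    return events;
  }

  public static void main(String[] args) {
    // Print the order for two supersteps and a single worker.
    for (String event : run(2, 1)) {
      System.out.println(event);
    }
  }
}
```

In this model the master step of superstep s+1 is the first thing that runs after the
workers' postSuperstep of superstep s, which is why logic from the latter can usually be
moved into the former.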


> On Aug. 29, 2013, 2:59 p.m., Ahmet Emre Aladag wrote:
> > giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexWorkerContext.java, line 43
> > <https://reviews.apache.org/r/13492/diff/4/?file=346224#file346224line43>
> >
> >     This was to use maxSteps in the postSuperstep method. Moreover, if I use preSuperstep, it will read this variable at each superstep, not once like here.

As I said, I think you should be able to do everything you currently do in the WorkerContext
by moving it to the master, and do the rest in Computation.preSuperstep(), such as reading
the maxSteps that is used in LinkRankComputation (not the one used in the WorkerContext's
postSuperstep()).
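
A rough sketch of the preSuperstep() idea, again as a self-contained toy rather than real
Giraph code. The class, the Map-based configuration, the key "sketch.maxSupersteps" and the
reachedLimit() helper are all hypothetical; the default of 10 just mirrors the "10
iterations by default" in the review description. The point is only that the value is read
once per superstep in the hook and then reused for every vertex.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a computation class with a preSuperstep() hook.
// Every name here is hypothetical; this does not use the Giraph API.
class PreSuperstepConfigSketch {
  static final String MAX_STEPS_KEY = "sketch.maxSupersteps"; // made-up key
  static final int DEFAULT_MAX_STEPS = 10;

  private final Map<String, Integer> conf;
  private int maxSteps;
  private int confReads;

  PreSuperstepConfigSketch(Map<String, Integer> conf) {
    this.conf = conf;
  }

  /** Called once per superstep: read the setting and cache it in a field. */
  void preSuperstep() {
    maxSteps = conf.getOrDefault(MAX_STEPS_KEY, DEFAULT_MAX_STEPS);
    confReads++;
  }

  /** Called per vertex: uses the cached value, never touches the configuration. */
  boolean reachedLimit(long superstep) {
    return superstep >= maxSteps;
  }

  int confReads() {
    return confReads;
  }

  public static void main(String[] args) {
    Map<String, Integer> conf = new HashMap<>();
    conf.put(MAX_STEPS_KEY, 3);
    PreSuperstepConfigSketch computation = new PreSuperstepConfigSketch(conf);
    for (long superstep = 0; superstep <= 3; superstep++) {
      computation.preSuperstep();
      boolean halt = false;
      for (int vertex = 0; vertex < 1000; vertex++) {
        halt = computation.reachedLimit(superstep);
      }
      System.out.println("superstep " + superstep + " halt=" + halt
          + " confReads=" + computation.confReads());
    }
  }
}
```

So the cost is one configuration lookup per superstep, not one per vertex, which should be
negligible next to the per-vertex work.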


- Claudio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13492/#review25739
-----------------------------------------------------------


On Aug. 29, 2013, 2:59 p.m., Ahmet Emre Aladag wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13492/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2013, 2:59 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Bugs: GIRAPH-729
>     https://issues.apache.org/jira/browse/GIRAPH-729
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> Currently, Nutch 2.x lacks LinkRank (a variant of PageRank). Adding a module for Nutch
> including LinkRank and other possible ranking algorithms would be useful for the Apache
> community. This module can be used by Nutch 1.x and other apps as well.
> 
> Attached you can find my patch. It includes:
> 
> * I/O formats (URL Text-URL Text edges, URL Text nodes) for reading from HDFS and HBase
> * Self-link and duplicate-link elimination
> * LinkRank computation (10 iterations by default).
> * Cumulative distribution normalization
> 
> 
> Diffs
> -----
> 
>   giraph-nutch/pom.xml PRE-CREATION
>   giraph-nutch/src/main/assembly/compile.xml PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankComputation.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexMasterCompute.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexWorkerContext.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/HostRankVertexFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/LinkRankEdgeFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/LinkRankVertexFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankEdgeInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexUniformInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2HostInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2HostOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/NutchUtil.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringDoublePair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringFloatPair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringStringPair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/package-info.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/HostRankHBaseTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankComputationTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankHBaseTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
>   pom.xml 41b6bb1
> 
> Diff: https://reviews.apache.org/r/13492/diff/
> 
> 
> Testing
> -------
> 
> * Unit tests for computation on HDFS and HBase.
> 
> 
> Thanks,
> 
> Ahmet Emre Aladag
> 
>

