hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Possible Aggregator Problem
Date Wed, 24 Apr 2013 08:43:32 GMT
Steven,

Could you please try your application again with
http://people.apache.org/~edwardyoon/dist/test/ and let me know
whether it works as you expected?

On Wed, Apr 24, 2013 at 4:53 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Thanks for your report. It could be a bug. I'll have a look at it now.
>
> On Wed, Apr 24, 2013 at 4:48 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>> I'm running version 0.6.1.
>> Looking at the results I found through testing,
>>
>>   public void aggregateVertex(M lastValue, Vertex<V, E, M> v)
>>
>> doesn't seem to be the problem. Both 'aggregate(v, v.getValue())' and
>> 'aggregate(v, lastValue, v.getValue())'
>> are called correctly and work on the same values.
>>
>> However, when finalizing through 'finalizeAggregation()' in the
>> 'public void doMasterAggregation(MapWritable updatedCnt)' method,
>>
>> the value aggregated by 'aggregate(v, lastValue, v.getValue())'
>> is lost. That is what I observe.
>>
>> Could it be that I'm implementing the aggregate methods incorrectly?
>>
>> In the end, however, I cannot find an obvious bug in TRUNK[1], although
>> it is not clear to me which part of the code was changed through
>> the ticket on JIRA.
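A minimal sketch of how the two aggregate overloads described above could interfere. It assumes (this is a guess, not checked against Hama's source) that both overloads accumulate into one shared field. SharedFieldAggregator, runBoth, and runDiffOnly are hypothetical names, and plain doubles stand in for Hama's Writable types.

```java
// Illustrative stand-in only: these are NOT Hama's classes.
// All names below are hypothetical.
class SharedFieldSketch {

    /** Assumption: both aggregate overloads accumulate into the same field. */
    static class SharedFieldAggregator {
        private double sum = 0.0;

        // Counterpart of aggregate(vertex, value): adds the raw vertex value.
        void aggregate(double value) {
            sum += value;
        }

        // Counterpart of aggregate(vertex, oldValue, newValue): adds |old - new|.
        void aggregate(double oldValue, double newValue) {
            sum += Math.abs(oldValue - newValue);
        }

        double getValue() {
            return sum;
        }
    }

    /** Feeds every vertex through both overloads, as observed in the thread. */
    static double runBoth(double[] oldValues, double[] newValues) {
        SharedFieldAggregator agg = new SharedFieldAggregator();
        for (int i = 0; i < newValues.length; i++) {
            agg.aggregate(oldValues[i], newValues[i]);
            agg.aggregate(newValues[i]);
        }
        return agg.getValue();
    }

    /** Feeds every vertex through the old/new overload only. */
    static double runDiffOnly(double[] oldValues, double[] newValues) {
        SharedFieldAggregator agg = new SharedFieldAggregator();
        for (int i = 0; i < newValues.length; i++) {
            agg.aggregate(oldValues[i], newValues[i]);
        }
        return agg.getValue();
    }

    public static void main(String[] args) {
        double[] oldValues = {0.25, 0.25, 0.5};
        double[] newValues = {0.25, 0.5, 0.25};
        System.out.println(runBoth(oldValues, newValues));     // prints 1.5
        System.out.println(runDiffOnly(oldValues, newValues)); // prints 0.5
    }
}
```

Under this assumption the summed difference (0.5) is not recoverable from the combined result (1.5), because the raw values (which sum to 1.0) are mixed into the same accumulator.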
>>
>>
>>
>>
>> On Wed, Apr 24, 2013 at 2:41 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>
>>> I found the ticket on JIRA -
>>> https://issues.apache.org/jira/browse/HAMA-659
>>>
>>> And it seems to be fixed already.
>>>
>>> Which version of Hama are you using? And can you find a bug in TRUNK[1]?
>>>
>>> 1.
>>> http://svn.apache.org/repos/asf/hama/trunk/graph/src/main/java/org/apache/hama/graph/AggregationRunner.java
>>>
>>> On Tue, Apr 23, 2013 at 9:41 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>> > Could anyone tell me if I'm correct concerning the possible problem I
>>> > posted and replied on in the previous two emails?
>>> >
>>> >
>>> > On Wed, Apr 17, 2013 at 5:08 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>> >
>>> >> Additionally, I found this in the mail archives:
>>> >>
>>> >> http://mail-archives.apache.org/mod_mbox/hama-user/201210.mbox/%3CCAJ-=ys=W8F5W4aduV+=+yfsvh41xSa22-wNqQRKapadZD+QBag@mail.gmail.com%3E
>>> >>
>>> >> This covers my point exactly. Is calling two different aggregate
>>> >> functions in a row still considered a bug?
>>> >>
>>> >>
>>> >> On Wed, Apr 17, 2013 at 2:35 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>> >>
>>> >>> Hi Thomas,
>>> >>>
>>> >>> Then I guess I did not explain myself clearly.
>>> >>> What you describe is indeed how I expect the AverageAggregator to work,
>>> >>> but if I use the AverageAggregator in my own PageRank implementation, it
>>> >>> does not return the average of all absolute differences but just the
>>> >>> average of all values.
>>> >>>
>>> >>> The (very) small example graph I use has only five vertices, where the
>>> >>> values of all vertices always sum to 1.0.
>>> >>> When I use the AverageAggregator, it always returns 0.2 when I call
>>> >>> the getLastAggregatedValue method.
>>> >>> It shouldn't do that, right?
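The arithmetic behind the 0.2 observation can be sketched as follows. This is plain Java, not Hama code: meanOfValues and meanAbsDiff are hypothetical helper names, and the vertex values are made up so that the five values sum to 1.0 as in the message above.

```java
// Plain-Java illustration; the helper names and sample values are
// hypothetical and are not taken from Hama.
class AverageSketch {

    /** Mean of the current vertex values: sum(values) / n. */
    static double meanOfValues(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Mean absolute difference between old and new values per vertex. */
    static double meanAbsDiff(double[] oldValues, double[] newValues) {
        double sum = 0.0;
        for (int i = 0; i < newValues.length; i++) {
            sum += Math.abs(oldValues[i] - newValues[i]);
        }
        return sum / newValues.length;
    }

    public static void main(String[] args) {
        // Five vertices whose values sum to 1.0 in both supersteps.
        double[] oldValues = {0.125, 0.25, 0.125, 0.25, 0.25};
        double[] newValues = {0.125, 0.125, 0.25, 0.25, 0.25};
        System.out.println(meanOfValues(newValues));           // prints 0.2
        System.out.println(meanAbsDiff(oldValues, newValues)); // prints 0.05
    }
}
```

Whenever five values sum to 1.0, their mean is 0.2 regardless of how they are distributed, so a constant 0.2 is what a plain mean of values would produce. A mean absolute difference, by contrast, changes from superstep to superstep and shrinks as the values settle.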
>>> >>>
>>> >>>
>>> >>> On Wed, Apr 17, 2013 at 1:18 PM, Thomas Jungblut <thomas.jungblut@gmail.com> wrote:
>>> >>>
>>> >>>> Hi Steven,
>>> >>>>
>>> >>>> the AverageAggregator is used to determine the average of all absolute
>>> >>>> differences between old pagerank and new pagerank for every vertex.
>>> >>>> This is how the javadoc of the given classes documents the intended
>>> >>>> behavior, and it suffices to track whether pagerank values have
>>> >>>> converged yet.
>>> >>>>
>>> >>>> What you describe is a perfectly valid way to track the pagerank
>>> >>>> difference throughout all supersteps. But this is not how (imho) the
>>> >>>> AverageAggregator should behave, so you have to write your own.
>>> >>>>
>>> >>>>
>>> >>>> 2013/4/17 Steven van Beelen <smcvbeelen@gmail.com>
>>> >>>>
>>> >>>> > The values in my case are the DoubleWritable values that each vertex
>>> >>>> > holds and that the aggregators aggregate on.
>>> >>>> > My tests showed that, when the aggregator was set to AverageAggregator,
>>> >>>> > the average of all the vertex values from the previous compute step was
>>> >>>> > returned.
>>> >>>> > Actually, AverageAggregator should return the average difference over
>>> >>>> > the old/new value pairs of all vertices instead of the mean.
>>> >>>> > The average difference is then used to check whether convergence is
>>> >>>> > reached, which is relevant for all tasks of course.
>>> >>>> >
>>> >>>> > Hence, the convergence point, for which the Aggregator is used, will
>>> >>>> > not be reached.
>>> >>>> > As a result, the algorithm just runs the maximum number of iterations
>>> >>>> > set (30 iterations in the PageRank example) in every case.
>>> >>>> > I experienced the same with my own PageRank implementation.
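The effect described above, where a run always hits the iteration cap, can be sketched with a simple loop. This is not Hama's control flow: the class, the supersteps method, and the EPSILON threshold are hypothetical, and only the cap of 30 iterations comes from the message.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of a convergence check; not Hama's actual loop.
class ConvergenceSketch {
    static final int MAX_ITERATIONS = 30; // cap mentioned in the thread
    static final double EPSILON = 0.001;  // assumed threshold, for illustration

    /**
     * Runs supersteps until the aggregated value drops below EPSILON or the
     * cap is reached, and returns how many supersteps were executed.
     * aggregateAt maps a superstep number (1-based) to the aggregated value.
     */
    static int supersteps(IntToDoubleFunction aggregateAt) {
        int iteration = 0;
        while (iteration < MAX_ITERATIONS) {
            iteration++;
            if (aggregateAt.applyAsDouble(iteration) < EPSILON) {
                break; // converged
            }
        }
        return iteration;
    }

    public static void main(String[] args) {
        // Aggregate stuck at the mean of the values: never converges.
        System.out.println(supersteps(i -> 0.2));              // prints 30
        // Aggregate that halves every superstep: converges early.
        System.out.println(supersteps(i -> Math.pow(0.5, i))); // prints 10
    }
}
```

If the aggregated value is a constant mean such as 0.2 rather than a shrinking difference, no threshold below that mean is ever crossed, so the loop only stops at the cap.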
>>> >>>> >
>>> >>>> > I think it has something to do with the finalizeAggregation step taken.
>>> >>>> > Next to that, both the 'aggregate(VERTEX vertex, M value)' and
>>> >>>> > 'aggregate(VERTEX vertex, M oldValue, M newValue)' methods are called
>>> >>>> > every time, where one would think only the second (with old/new values)
>>> >>>> > would suffice.
>>> >>>> > Because of this, the global variable 'absoluteDifference' in the
>>> >>>> > 'AbsDiffAggregator' class is overwritten/overruled by the first
>>> >>>> > aggregate.
>>> >>>> > Additionally, when I made my own Aggregation class in the same
>>> >>>> > fashion as AbsDiffAggregator and AverageAggregator, but left out the
>>> >>>> > 'aggregate(VERTEX vertex, M value)' method, my output turned out to be
>>> >>>> > 0.0000 every time.
>>> >>>> >
>>> >>>> > I hope I made myself clear.
>>> >>>> > Regards
>>> >>>> >
>>> >>>> >
>>> >>>> > On Wed, Apr 17, 2013 at 11:57 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>> >>>> >
>>> >>>> > > Thanks for your report.
>>> >>>> > >
>>> >>>> > > What's the meaning of 'all the values'? Please give me more details
>>> >>>> > > about your problem.
>>> >>>> > >
>>> >>>> > > I didn't look closely at the 'dangling links & aggregators' part of the
>>> >>>> > > PageRank example, but I think there's no bug. Aggregators are just used
>>> >>>> > > for global communication. For example, finding the max value[1] can be
>>> >>>> > > done in only one iteration using MaxValueAggregator.
>>> >>>> > >
>>> >>>> > > 1. http://cdn.dejanseo.com.au/wp-content/uploads/2011/06/supersteps.png
>>> >>>> > >
>>> >>>> > > On Wed, Apr 17, 2013 at 6:27 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>> >>>> > > > Hello,
>>> >>>> > > >
>>> >>>> > > > I'm creating my own pagerank in hama for testing and I think I found a
>>> >>>> > > > problem with the AverageAggregator. I'm not sure if it is me or the
>>> >>>> > > > AverageAggregator class in general, but I believe it just returns the
>>> >>>> > > > mean of all the values instead of the average difference between the
>>> >>>> > > > old and new value as intended.
>>> >>>> > > >
>>> >>>> > > > For testing, I created my own AbsDiffAggregator and AverageAggregator
>>> >>>> > > > classes, using FloatWritable instead of DoubleWritable. The same
>>> >>>> > > > problem still occurred: I got a mean of all the values in the graph
>>> >>>> > > > instead of an average difference.
>>> >>>> > > >
>>> >>>> > > > Could someone tell me if I'm doing something wrong or what I should
>>> >>>> > > > provide to better explain my problem?
>>> >>>> > > >
>>> >>>> > > > Regards,
>>> >>>> > > > Steven van Beelen, Vrije Universiteit of Amsterdam
>>> >>>> > >
>>> >>>> > >
>>> >>>> > >
>>> >>>> > > --
>>> >>>> > > Best Regards, Edward J. Yoon
>>> >>>> > > @eddieyoon
>>> >>>> > >
>>> >>>> >
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
