mahout-user mailing list archives

From 张玉东 <zhangyud...@vancl.cn>
Subject Re: why use the job 'itemIDIndex' to convert the itemid to index?
Date Tue, 20 Sep 2011 10:52:54 GMT
Thanks, I understand now. I was not familiar with the non-distributed algorithms.

-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: September 20, 2011 18:46
To: user@mahout.apache.org
Subject: Re: why use the job 'itemIDIndex' to convert the itemid to index?

It is necessary. We want to support input where IDs are possibly
64-bit longs, for consistency with the non-distributed code.
But 64-bit values are too large to be used as indexes into a Vector,
whose indexes are 32-bit ints. So the IDs are hashed to int indexes,
and later un-hashed back to the original IDs by a dictionary lookup.
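
To make the hash-then-lookup idea concrete, here is a minimal sketch in Java. It is only an illustration under stated assumptions, not Mahout's actual code: the class IdIndexSketch and its methods are hypothetical names, and the XOR-folding hash is just one common way to squeeze a long into a non-negative int.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these names are not part of Mahout's API.
final class IdIndexSketch {

    // Fold a 64-bit ID into a non-negative 32-bit value usable as a vector index.
    // XOR the high and low halves, then clear the sign bit.
    static int idToIndex(long id) {
        return 0x7FFFFFFF & (int) (id ^ (id >>> 32));
    }

    // Build the index -> original ID dictionary used to "un-hash" afterwards.
    // If two IDs collide, the later one overwrites the earlier one in this sketch.
    static Map<Integer, Long> buildDictionary(long[] itemIds) {
        Map<Integer, Long> indexToId = new HashMap<>();
        for (long id : itemIds) {
            indexToId.put(idToIndex(id), id);
        }
        return indexToId;
    }
}
```

With this sketch, an ID such as 9000000000L, which does not fit in an int, maps to an int index, and the dictionary returns the original ID for that index. Distinct IDs can still collide (here, 1L and 1L << 32 share an index), which is the small collision probability discussed in this thread.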

On Tue, Sep 20, 2011 at 11:44 AM, 张玉东 <zhangyudong@vancl.cn> wrote:
> Yes, the probability of collision is quite small. But I mean this step is not necessary;
> I cannot see how it helps the computations that follow.
>