mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Split data to training and test in CF
Date Thu, 05 Apr 2012 05:23:13 GMT
After you have tuned your CF using training/test splits, you then
try it against your known-good data. The known-good data is a second
kind of test data.

This assumes that your input data and your "gold standard" data have
the same statistical profile. If performance on the sampled test data
differs from performance on the known-good test data, then you might
be comparing two different kinds of data.
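
As a rough illustration of the two-stage check described above, here
is a minimal Python sketch. It does not use Mahout; the function names
(split_ratings, item_mean_predictor, mean_absolute_error) are
hypothetical, and the item-mean predictor is only a toy stand-in for a
real recommender. It assumes ratings are (user, item, rating) triples
and that the training list is non-empty.

```python
import random


def split_ratings(ratings, train_fraction=0.8, seed=42):
    """Randomly split (user, item, rating) triples into train and test lists."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def item_mean_predictor(train):
    """Toy stand-in for a recommender: predict an item's mean training rating.

    Items never seen in training fall back to the global mean rating.
    """
    totals = {}
    for _, item, rating in train:
        rating_sum, count = totals.get(item, (0.0, 0))
        totals[item] = (rating_sum + rating, count + 1)
    global_mean = sum(rating for _, _, rating in train) / len(train)
    means = {item: s / n for item, (s, n) in totals.items()}
    return lambda user, item: means.get(item, global_mean)


def mean_absolute_error(predict, data):
    """Average absolute gap between predicted and actual ratings."""
    return sum(abs(predict(u, i) - r) for u, i, r in data) / len(data)
```

One possible use: call split_ratings on the input data, build the
predictor from the training part, then compute mean_absolute_error
once on the held-out test part and once on the separate known-good
set. A large gap between the two scores would be a hint that the two
data sets do not share the same statistical profile.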

On Tue, Apr 3, 2012 at 9:07 PM, ziad kamel <ziad.kamel25@gmail.com> wrote:
> Hi! I understand the reason behind splitting data into training and
> test sets during classification and clustering, but why do we need to
> do that during CF? We just select the top X from the list and compare
> with good recommendations.
> Thanks



-- 
Lance Norskog
goksron@gmail.com
