Hi Lance,
IMHO the best way to measure how much information you are losing with
your derivation function is to run a cross-validation scheme on both
the original data set and the derived data set.
But be sure to compare the same validation subsets in both cases (the
original and the derived). I mean, if you use an 80%/20% training/validation
split with 5-fold cross-validation, make sure each fold uses the same
subset of both sets.
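
To make the "same subset" point concrete, here is a minimal sketch in plain
Java (no Mahout classes). It assumes the original and the derived data sets
share the same user and item IDs. SharedFolds, foldOf and heldOut are
hypothetical names made up for this illustration, and the hash mixing is
just one reasonable choice. The fold of a rating depends only on its
(userID, itemID) pair, never on the rating value, so both data sets hold
out exactly the same pairs in every fold.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: deterministic fold assignment shared by two data sets. */
public final class SharedFolds {

    private SharedFolds() {}

    /**
     * Maps a (user, item) pair to a fold in [0, numFolds).
     * Depends only on the IDs, so the original and the derived
     * data set agree on which pairs belong to which fold.
     */
    public static int foldOf(long userID, long itemID, int numFolds) {
        long h = userID * 0x9E3779B97F4A7C15L + itemID; // combine both IDs
        h ^= (h >>> 32);                                 // mix high bits into low bits
        h *= 0xBF58476D1CE4E5B9L;
        h ^= (h >>> 29);
        return (int) Math.floorMod(h, (long) numFolds);  // always non-negative
    }

    /** Returns the {userID, itemID} pairs held out for validation in the given fold. */
    public static List<long[]> heldOut(List<long[]> userItemPairs, int fold, int numFolds) {
        List<long[]> validation = new ArrayList<>();
        for (long[] pair : userItemPairs) {
            if (foldOf(pair[0], pair[1], numFolds) == fold) {
                validation.add(pair);
            }
        }
        return validation;
    }
}
```

With numFolds = 5 each fold holds out roughly 20% of the pairs. Run
heldOut on the pairs of each data set with the same fold number, train on
the rest, and compare the two error figures fold by fold.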
Regards,
Federico
2010/10/22 Sean Owen <srowen@gmail.com>:
> Yah I still think held-out data is the best thing, if you want to use this
> built-in evaluation mechanism. Hold out the same data from both models and
> run the same test.
>
> There is another approach which doesn't necessarily require held-out data.
> On the original, full model, just compute recommendations for any users you
> like. Assume these are "correct". Then do the same for the derived model.
>
> It will return to you estimated preferences in both cases. You could use the
> deltas as a measure of "error" (unless your derived model has quite a
> different rating space).
>
> Or simply use the difference in rankings: compute some metric that
> penalizes having recommendations in different places in the ordering.
>
> I'll say I don't know which of these is most mathematically sound.
> Interpreting the results may be hard. But, any of these should give a notion
> of "better" and "worse".
>
>
> Assuming the original model's recommendations are "correct" is a reasonably
> big assumption. For example, the whole point of an SVD recommender is to modify the
> model (reduce its dimension really) in order to be able to recommend items
> that should be recommended, but weren't before due to model sparseness.
> There, transforming the data in theory gives better results. That it's
> different doesn't mean worse necessarily.
>
> But maybe that's not an issue for your use case, don't know.
>
>
> On Fri, Oct 22, 2010 at 5:39 AM, Lance Norskog <goksron@gmail.com> wrote:
>
>> Here is my use case: I have two data models.
>> 1) the original data, for example GroupLens
>> 2) the derivative. This is a second data model which is derived from
>> the original. It is made with a one-way function from the master.
>>
>> I wish to measure how much information is lost in the derivation
>> function. There is some entropy, so therefore the derived data model
>> cannot supply recommendations as good as the original data. But how
>> much worse?
>>
>> My naive method is to make recommendations using the master model, and
>> the derived model, and compare them. If the recommendations from the
>> derived model are, say, 90% as good as from the original data, then
>> the derivation function is ok.
>>
>> Now, obviously, the gold standard for recommendations is the data in
>> the original model. So, I make recommendations from the original, and
>> the derived, from the user/item prefs given in the original data. I
>> don't really care about what the user gave as preferences: I want to
>> know what the recommender algorithm itself thinks. But the
>> recommenders just parrot back the data model instead of giving me
>> their own opinion. Thus, the point of this whole thread. But how
>> recommender algorithms work is a side issue. I'm trying to use them as
>> an indirect measurement of something else.
>>
>> What is another way to test what I'm trying to test? What is another
>> way to evaluate the quality of my derivation function?
>>
>> On Wed, Oct 20, 2010 at 11:41 PM, Sebastian Schelter <ssc@apache.org>
>> wrote:
>> > Hi Lance,
>> >
>> > When evaluating a recommender you should split your dataset in a training
>> > and test part. Only data from the training part should be included in your
>> > DataModel and you only measure the accuracy of predicting ratings that are
>> > included in the test part (which is not known by your recommender). If you
>> > structure things this way, the current implementation should work fine for
>> > you.
>> >
>> > sebastian
>> >
>> > On 21.10.2010 04:56, Lance Norskog wrote:
>> >>
>> >> Since this is Recommender day, here is another kvetch:
>> >>
>> >> The recommender implementations with algorithms all do this in
>> >> Recommender.estimatePreference():
>> >>   public float estimatePreference(long userID, long itemID) throws TasteException {
>> >>     DataModel model = getDataModel();
>> >>     Float actualPref = model.getPreferenceValue(userID, itemID);
>> >>     if (actualPref != null) {
>> >>       return actualPref;
>> >>     }
>> >>     return doEstimatePreference(userID, itemID);
>> >>   }
>> >>
>> >> Meaning: "if I told you something, just parrot it back to me."
>> >> Otherwise, make a guess.
>> >>
>> >> I am doing head-to-head comparisons of the dataModel preferences vs.
>> >> the Recommender. This code makes it impossible to directly compare
>> >> what the recommender thinks vs. the actual preference. If I wanted to
>> >> know what I told it, I already know that. I want to know what the
>> >> recommender thinks.
>> >>
>> >> If this design decision is something y'all have argued about and
>> >> settled on, never mind. If it is just something that seemed like a
>> >> good idea at the time, can we change the recommenders, and the
>> >> Recommender "contract", to always use their own algorithm?
>> >>
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>
