mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Incremental training of Classifier
Date Tue, 29 Dec 2009 00:44:14 GMT
mani,

You are sounding more and more like the poster child for an on-line
classifier.

The idea would be that you would give your classified docs to the system
first for testing, then again for incremental training.  You can use the
results of the test to adjust the learning rate for the incremental
learning.

See the work I have started with MAHOUT-228 for the beginnings of this.  Let
me know where it should go to help with your needs (i.e., which entry points
you would need).

On Mon, Dec 28, 2009 at 1:33 PM, Mani Kumar <manikumarchauhan@gmail.com> wrote:

> Let's talk about bigger numbers, e.g. I have more than 1 million docs and I
> get 10k new docs every day, out of which 6k are already classified.
>
> Monitoring performance is good but it can be done weekly instead of daily
> just to reduce cost.
>
> I actually wanted to avoid retraining as much as possible because it
> comes at a huge cost for a large dataset.
>



-- 
Ted Dunning, CTO
DeepDyve
