mahout-user mailing list archives

From Ted Dunning
Subject Re: one vector or many vectors?
Date Thu, 01 Nov 2012 16:09:53 GMT
Your mileage will vary.

It is often helpful to classify small parts of large articles and then
somehow deal with these multiple classifications at the full document level.
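
One simple way to deal with the multiple classifications is to average the
per-chunk class probabilities and pick the best class for the document. The
sketch below is only an illustration under stated assumptions: it assumes the
classifier returns a probability per label for each chunk, and the function
names and labels are hypothetical, not part of Mahout. Voting or a second-stage
classifier over the chunk results are equally reasonable alternatives.

```python
from typing import Dict, Sequence


def aggregate_chunk_scores(chunk_scores: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average per-chunk class probabilities into one document-level score per label.

    chunk_scores holds one dict per chunk, mapping label -> probability
    (assumed to come from whatever classifier scored the chunks).
    """
    if not chunk_scores:
        raise ValueError("need at least one chunk")
    totals: Dict[str, float] = {}
    for scores in chunk_scores:
        for label, p in scores.items():
            totals[label] = totals.get(label, 0.0) + p
    n = len(chunk_scores)
    return {label: total / n for label, total in totals.items()}


def document_label(chunk_scores: Sequence[Dict[str, float]]) -> str:
    """Return the label with the highest averaged probability."""
    averaged = aggregate_chunk_scores(chunk_scores)
    return max(averaged, key=averaged.get)
```

Averaging lets a few confident chunks outweigh many uncertain ones; whether
that is what you want depends on your data.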

Sometimes it is not helpful, especially if the small parts get too small.

Try it both ways.  My tendency is to prefer to classify book-sized things
at a level smaller than a chapter and sometimes as small as a paragraph.
 Going below the paragraph level is usually bad.
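
To try the chunked approach, you need a splitter that does not produce pieces
that are too small. Here is a minimal sketch, assuming paragraphs are separated
by blank lines and that a word-count floor is a fair proxy for "too small".
The function name and the min_words default are assumptions for illustration;
tune the threshold on your own articles.

```python
import re
from typing import List


def split_into_chunks(text: str, min_words: int = 50) -> List[str]:
    """Split text on blank lines, merging short paragraphs until each chunk
    has at least min_words words, so no chunk is smaller than a paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    buffer: List[str] = []
    for para in paragraphs:
        buffer.append(para)
        if sum(len(p.split()) for p in buffer) >= min_words:
            chunks.append("\n\n".join(buffer))
            buffer = []
    if buffer:
        tail = "\n\n".join(buffer)
        if chunks:
            # A short leftover tail is attached to the previous chunk
            # rather than emitted as a tiny chunk of its own.
            chunks[-1] = chunks[-1] + "\n\n" + tail
        else:
            # The whole text is shorter than min_words: keep it as one chunk.
            chunks.append(tail)
    return chunks
```

Each chunk then becomes one training vector carrying the label of the article
it came from, which you can compare against the one-vector-per-article setup.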

On Thu, Nov 1, 2012 at 3:23 AM, dennis zhuang wrote:

> Hi all,
>    I am using the SGD classifier to classify our articles. I want to
> train a new model, but there is a problem. I can provide the learner with
> one large article or several small articles, but I extract only one vector
> per article. Is there any difference for the learner between one vector
> and many vectors during training? Should I provide the learner with one
> large article or many small articles? I can't find any documentation about
> this. Can anybody help me? Thanks.
> --
> 庄晓丹
> Twitter:      @killme2008
