lucene-java-user mailing list archives

From Marcel Stor
Subject RE: Document Clustering
Date Tue, 11 Nov 2003 19:05:30 GMT
Stefan Groschupf wrote:
> Hi,
> > How is document clustering different/related to text categorization?
> Clustering: the algorithm tries to find its own categories and puts
> matching documents into them. Documents with minimal distance to each
> other are grouped together.

Would I be correct to say that you have to define a "distance threshold"
parameter in order to define when to build a new category for a certain
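
One way to read the "distance threshold" idea is a single-pass scheme: each document joins its nearest existing cluster if the distance is within the threshold, and otherwise starts a new cluster. The sketch below is only an illustration under assumptions not stated in the thread: documents are treated as bags of whitespace-separated words, distance is cosine distance, and the names `cosine_distance` and `threshold_cluster` are hypothetical. It does not use Lucene.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty vector is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def threshold_cluster(docs, threshold):
    """Single-pass clustering: join the nearest cluster or start a new one.

    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [indices]}
    for i, text in enumerate(docs):
        vec = Counter(text.lower().split())
        best, best_dist = None, None
        for cluster in clusters:
            d = cosine_distance(vec, cluster["centroid"])
            if best_dist is None or d < best_dist:
                best, best_dist = cluster, d
        if best is not None and best_dist <= threshold:
            best["members"].append(i)
            # Summed term counts serve as an unnormalized centroid;
            # cosine distance ignores the overall scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return [c["members"] for c in clusters]
```

With this scheme a smaller threshold yields more, tighter clusters, and the result depends on the order in which documents arrive. Other clustering methods (for example, ones that fix the number of clusters up front) do not need such a threshold.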

> Classification: you already have categories and sample documents
> for them, which help you match other documents.
> You calculate the distance from a document to each existing category
> and put it in the category with the smallest distance.
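
The classification description quoted above can be sketched as a nearest-centroid classifier: build one term-count vector per category from its sample documents, then assign a new document to the category at the smallest distance. This is a minimal illustration, not the method any particular library uses; the bag-of-words representation, cosine distance, and the names `to_vector`, `train_centroids`, and `classify` are all assumptions made for the example.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn text into a term-count vector (lowercased, whitespace split)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty vector is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def train_centroids(samples):
    """Build one summed term-count vector per category.

    samples maps a category name to a list of example texts.
    """
    centroids = {}
    for category, texts in samples.items():
        total = Counter()
        for text in texts:
            total.update(to_vector(text))
        centroids[category] = total
    return centroids


def classify(text, centroids):
    """Return the category whose centroid is closest to the text."""
    vec = to_vector(text)
    return min(centroids, key=lambda c: cosine_distance(vec, centroids[c]))


samples = {
    "search": ["lucene index query", "search engine index"],
    "sports": ["football goal match", "league football season"],
}
centroids = train_centroids(samples)
label = classify("index a query with lucene", centroids)
```

Unlike the clustering case, no threshold is needed here: every document lands in some existing category. If a document shares no terms with any category, all distances tie and `min` simply returns the first category, so a real system would likely want an explicit "unknown" case.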

