lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: Duplicate Hits
Date Tue, 01 Feb 2005 15:14:54 GMT

On Feb 1, 2005, at 9:49 AM, Jerry Jalenak wrote:
> Given Erik's response of 'don't put duplicate documents in the index', how
> can I accomplish this in the IndexWriter?

As John said - you'll have to come up with some way of knowing whether 
you should index or not.  For example, when dealing with filesystem 
files, the Ant <index> task (in the sandbox) checks last modified date 
and only indexes new files.
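
A minimal sketch of that kind of last-modified check, using only the JDK. 
This is not the Ant task's actual code; the class name 
IncrementalFileSelector, its methods, and the idea of keeping the time of 
the previous indexing run as a long are all assumptions for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Ant.
final class IncrementalFileSelector {
    private IncrementalFileSelector() {}

    // True if the file was modified after the previous indexing run.
    static boolean needsIndexing(long fileLastModifiedMillis, long lastIndexRunMillis) {
        return fileLastModifiedMillis > lastIndexRunMillis;
    }

    // Returns the plain files directly under dir that changed since the last run.
    // A missing or unreadable directory yields an empty list.
    static List<File> selectFilesToIndex(File dir, long lastIndexRunMillis) {
        List<File> result = new ArrayList<>();
        File[] children = dir.listFiles();
        if (children == null) {
            return result;
        }
        for (File f : children) {
            if (f.isFile() && needsIndexing(f.lastModified(), lastIndexRunMillis)) {
                result.add(f);
            }
        }
        return result;
    }
}
```

Only the files returned by selectFilesToIndex would then be handed to the 
IndexWriter, so unchanged files are never added a second time.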

People generally handle this with a unique id on their data (a primary 
key from a DB, the URL of a web page, etc).
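
As a rough illustration of the unique id approach, the sketch below tracks 
ids that have already been indexed and rejects repeats. It uses only the 
JDK; UniqueIdGate and shouldIndex are hypothetical names, and the 
in-memory set is an assumption standing in for whatever record of indexed 
ids you keep (in practice you would more likely check an id field in the 
index itself).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class UniqueIdGate {
    private final Set<String> seenIds = new HashSet<>();

    // Returns true the first time an id is offered, false for any repeat.
    // Set.add reports whether the element was newly inserted.
    boolean shouldIndex(String uniqueId) {
        return seenIds.add(uniqueId);
    }

    // Number of distinct ids accepted so far.
    int size() {
        return seenIds.size();
    }
}
```

A document would be added through the IndexWriter only when shouldIndex 
returns true for its primary key or URL.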

