lucy-user mailing list archives

From "Desilets, Alain" <Alain.Desil...@nrc-cnrc.gc.ca>
Subject RE: [lucy-user] Avoid duplicate docs in hits?
Date Tue, 28 Aug 2012 19:47:48 GMT
When I started working with Lucy, I expected it to work like a kind of relational DB table,
where certain fields of an index acted like "unique keys" for the records (which in turn would
guarantee that there can be only one record with a given key). But that's not how Lucy is
designed.

So in the end, we implemented our own class, LucyIndex, which adds this kind of functionality.
When defining the schema for the index, you indicate which field will act as the key. From
then on, if you add a record whose key value is the same as that of an existing record,
the class will delete the existing record and replace it with the one you provide. It wasn't
hard to implement, but I am surprised this kind of functionality is not standard in Lucy.
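
To illustrate the replace-on-duplicate-key behaviour described above, here is a minimal sketch in Python. It is an in-memory stand-in only: it is not the actual LucyIndex class and it does not use Lucy's API. The names KeyedIndex, delete_by_key, add_doc and the "url" key field are all hypothetical, chosen for the example.

```python
class KeyedIndex:
    """Hypothetical in-memory stand-in for an index with a unique key field.

    Not Lucy's API: it only illustrates the delete-then-add pattern.
    """

    def __init__(self, key_field):
        self.key_field = key_field  # name of the field that acts as the key
        self.docs = []              # stored documents, in insertion order

    def delete_by_key(self, key_value):
        """Remove every stored document whose key field equals key_value."""
        before = len(self.docs)
        self.docs = [d for d in self.docs if d[self.key_field] != key_value]
        return before - len(self.docs)  # number of documents removed

    def add_doc(self, doc):
        """Add a document, first deleting any existing one with the same key."""
        self.delete_by_key(doc[self.key_field])
        self.docs.append(dict(doc))  # store a copy so callers cannot mutate it

    def get(self, key_value):
        """Return the document with the given key, or None if absent."""
        for d in self.docs:
            if d[self.key_field] == key_value:
                return d
        return None


index = KeyedIndex(key_field="url")
index.add_doc({"url": "http://example.com/a", "title": "First version"})
index.add_doc({"url": "http://example.com/b", "title": "Another page"})
# Same key as the first document: the old record is replaced, not duplicated.
index.add_doc({"url": "http://example.com/a", "title": "Second version"})
```

With a real index, the same idea would presumably amount to issuing a delete for the key's term before adding the new document within the same indexing session.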

Alain


-----Original Message-----
From: Peter Karman [mailto:peter@peknet.com] 
Sent: Wednesday, August 15, 2012 11:41 AM
To: user@lucy.apache.org
Subject: Re: [lucy-user] Avoid duplicate docs in hits?

On 8/15/12 2:49 AM, Lee Goddard wrote:
> HI
>
> Just started playing with Lucy, but I can't find a way to prevent 
> duplicate hits being returned.

Lucy won't return duplicate hits. But it also won't prevent you from inserting duplicate documents,
for some value of "duplicate".

A small, reproducible example is best if you are looking for help.

--
Peter Karman  .  http://peknet.com/  .  peter@peknet.com