lucene-solr-user mailing list archives

From Matthieu Labour <matthieu_lab...@yahoo.com>
Subject How to query for similar documents before indexing
Date Mon, 10 May 2010 20:39:34 GMT
Hi

I want to implement the following logic:

Before I index a new document into the index, I want to check if there are already documents
in the index with similar content to the content of the document about to be inserted. If
the request returns 1 or more documents, then I don't want to insert the document.
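
To make the intended check concrete, here is a minimal sketch of the "look for a near-duplicate, then insert" logic. It is not Lucene or Solr code: it assumes a plain in-memory list standing in for the index, and it swaps in word-shingle Jaccard similarity as the similarity measure. The names shingles, jaccard, insert_if_novel and the 0.9 threshold are hypothetical, chosen only for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of the lowercased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def insert_if_novel(index, text, threshold=0.9):
    """Append text to index unless a stored document is at least
    `threshold` similar. Returns True if the document was inserted."""
    candidate = shingles(text)
    for existing in index:
        if jaccard(candidate, existing["shingles"]) >= threshold:
            return False  # a similar document already exists, skip it
    index.append({"text": text, "shingles": candidate})
    return True
```

Calling insert_if_novel(index, text) with an initially empty list as index either stores the document and returns True, or returns False when a similar one is already present. This version compares against every stored document, so it only illustrates the logic and would not scale to a large index.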

What is the best way to achieve the above functionality?

I read about fuzzy searches in Lucene. But can I really build a request such as
mydoc.title:wordexample~ AND mydoc.content:( all the content words)~0.9 ?

Thank you for your help




      