lucene-java-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: New type of proximity/fuzzy search
Date Wed, 31 Aug 2016 19:40:56 GMT
Doh, sorry, Uwe, didn't see your response first.

Scratch SpanOrQuery; take a look at SpanNearQuery instead.  This would be a great capability to have!

-----Original Message-----
From: Allison, Timothy B. 
Sent: Wednesday, August 31, 2016 3:30 PM
To: java-user@lucene.apache.org
Subject: RE: New type of proximity/fuzzy search

Unfortunately, that does require a new type of query.  As you probably know, you can do the
"at least" (minimum number should match) with regular BooleanQueries, but you can't yet do
the "at least" with SpanQuery.  You might want to look at modifying the SpanOrQuery to get
this functionality.  It would be a great capability to have.  Perhaps open an issue and submit
a patch?

-----Original Message-----
From: Saar Carmi [mailto:saarcarmi@gmail.com] 
Sent: Tuesday, August 30, 2016 11:03 PM
To: java-user@lucene.apache.org
Subject: New type of proximity/fuzzy search

Hi
I would appreciate some guidance on implementing the following type of query.

Given a set of search terms (t1, t2, t3, ..., ti), return all documents in which at least
c=3 of the search terms appear within some sequence of x=10 consecutive tokens.

So, for example, the following document matches the search (expand, discount, file, search,
lookup):

"Many of us rely on Windows Search to find files and launch programs, but searching for text
within files is limited to specific filetypes by default. Here’s how you can *expand* your
*search* to include other text based *files*."

The terms expand, search, and files all appear within the last 10 words of the document, so
there is a match.
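
To make the intended semantics concrete, here is a minimal sketch in plain Java of the matching rule described above, run over an already-tokenized document. It is not a Lucene query and uses no Lucene APIs; the class name WindowTermMatcher, the method names, and the crude plural stripping (a stand-in for whatever stemming the analyzer would apply, so that "files" matches "file") are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: checks the "at least c terms
// within some window of x tokens" rule on a plain token list.
class WindowTermMatcher {

    // Crude normalization (assumption): lowercase, split on non-letters,
    // and strip a trailing "s" from words longer than three letters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase().split("[^a-z]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            if (raw.length() > 3 && raw.endsWith("s")) {
                raw = raw.substring(0, raw.length() - 1);
            }
            tokens.add(raw);
        }
        return tokens;
    }

    // Returns true if some run of windowSize consecutive tokens contains
    // at least minDistinct distinct query terms.
    static boolean matches(List<String> tokens, Set<String> terms,
                           int windowSize, int minDistinct) {
        // Occurrence count of each query term inside the current window.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            String entering = tokens.get(i);
            if (terms.contains(entering)) {
                counts.merge(entering, 1, Integer::sum);
            }
            // Drop the token that just slid out of the window.
            if (i >= windowSize) {
                String leaving = tokens.get(i - windowSize);
                if (terms.contains(leaving)) {
                    counts.computeIfPresent(leaving, (k, v) -> v == 1 ? null : v - 1);
                }
            }
            if (counts.size() >= minDistinct) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(
                Arrays.asList("expand", "discount", "file", "search", "lookup"));
        String doc = "Many of us rely on Windows Search to find files and launch programs, "
                + "but searching for text within files is limited to specific filetypes by "
                + "default. Here's how you can expand your search to include other text "
                + "based files.";
        System.out.println(matches(tokenize(doc), terms, 10, 3));
    }
}
```

The sliding window keeps a per-term count so that repeated occurrences of the same term are counted once, which is what "at least c of the search terms" seems to require. Inside Lucene the same bookkeeping would have to be done over term positions rather than over a token list.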

Does any documentation exist on adding new types of queries to the Lucene engine?

Saar