lucene-java-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Finding words not followed by other words
Date Sat, 12 Jul 2014 14:45:05 GMT
Hi Michael,

I haven't executed this yet, but can you try this:

SpanNotQuery(SpanNearQuery("George Washington"), SpanNearQuery("George Washington Carver"))
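
The idea behind the query above is that a span-not query keeps only those "George Washington" matches that do not overlap a "George Washington Carver" match, so it works per occurrence, not per document. In Lucene each phrase would presumably be a SpanNearQuery over SpanTermQuery clauses with slop 0 and in-order matching. The following is a plain Java sketch of that overlap logic only, with no Lucene calls; the class name SpanNotSketch, its helper methods, and the crude tokenizer are all hypothetical and just illustrate the expected outcome on the three example documents:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of span-not semantics; this is NOT Lucene code.
final class SpanNotSketch {

    /** Lower-cases and splits on non-alphanumerics; a crude stand-in for an analyzer. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Returns [start, end) token spans where the phrase occurs exactly (slop 0, in order). */
    static List<int[]> phraseSpans(List<String> tokens, String... phrase) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i + phrase.length <= tokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < phrase.length; j++) {
                if (!tokens.get(i + j).equals(phrase[j])) {
                    match = false;
                    break;
                }
            }
            if (match) {
                spans.add(new int[] {i, i + phrase.length});
            }
        }
        return spans;
    }

    /** True if at least one "george washington" span overlaps no "george washington carver" span. */
    static boolean matches(String text) {
        List<String> tokens = tokenize(text);
        List<int[]> include = phraseSpans(tokens, "george", "washington");
        List<int[]> exclude = phraseSpans(tokens, "george", "washington", "carver");
        for (int[] in : include) {
            boolean overlaps = false;
            for (int[] ex : exclude) {
                // Two half-open intervals overlap when each starts before the other ends.
                if (in[0] < ex[1] && ex[0] < in[1]) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] docs = {
            "George Washington Carver blah blah blah.",
            "George Washington blah blah blah.",
            "George Washington Carver blah blah blah. George Washington blah blah blah."
        };
        for (int i = 0; i < docs.length; i++) {
            System.out.println("doc " + (i + 1) + ": " + matches(docs[i]));
        }
    }
}
```

Under these assumptions document 1 is rejected because its only "George Washington" span lies inside the "Carver" span, while documents 2 and 3 each contain at least one occurrence that overlaps nothing. Whether the real SpanNotQuery behaves the same way on your index still needs to be checked against your analyzer and Lucene version.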

Koji
-- 
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

(2014/07/11 23:20), Michael Ryan wrote:
> I'm trying to solve the following problem...
>
> I have 3 documents that contain the following contents:
> 1: "George Washington Carver blah blah blah."
> 2: "George Washington blah blah blah."
> 3: "George Washington Carver blah blah blah. George Washington blah blah blah."
>
> I want to create a query that matches documents 2 and 3, but not 1. That is, I want to
> find documents that mention "George Washington". It's okay if they also mention "George Washington
> Carver", but I don't want documents that only mention "George Washington Carver". So simply
> doing something like this does not solve it:
> "George Washington" NOT "George Washington Carver"
>
> Is there a Query type that does this out of the box? I've looked at the various types
> of span queries, but none of them seem to do this. I think it should be theoretically possible
> given the position data that Lucene stores...
>
> -Michael
>