lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aleksander M. Stensby" <>
Subject Re: Splitting queries, or using two different parsers
Date Tue, 30 Jan 2007 09:13:11 GMT
kk, i excuse myself for being so ignorant and not looking through the API  
I found the PerFieldAnalyzerWrapper which i think will do the trick:)

So erhm.. just ignore this message:)

- Aleksander

On Tue, 30 Jan 2007 09:39:07 +0100, Aleksander M. Stensby  
<> wrote:

> Hey everyone! I have a question/problem I hope some of you guys can help  
> me with.
> I have this case where i have put my self in a bit of trouble... The  
> thing is i have several fields indexed, one being "source" and one being  
> "content" (which is the default field), among other fields that are not  
> really that important.
> The thing is, the content field and most of the other fields are parsed  
> and tokenized using a StandardAnalyzer with English stop words. So, this  
> field (and most of the others) are lowercased when indexed, and thus  
> search on these fields should be performed in lower case (or the best  
> thing would of course be to use the same analyzer.)
> The problem occurs when we examine the source field, because here, the  
> field is case-sensitive, and thus search on this field need to be kept  
> case-sensitive aswell, and it should not me tokenized on characters such  
> as "-.," etc.
> Now. How do i solve this?
> I want the user to be able to search using the regular lucene Query  
> Syntax. I.e., an input query like:
> "my Search ExPresssiON goes here" AND source:(Some-Source)
> I tried one thing, and that was to split the input query-string, parse  
> the occurance of "source:", and then use two different parsers on the  
> two parts, then combining them into a new query. But the problem is that  
> this involves many different scenarios, and also parsing failure if i  
> parse it into:
> "my Search ExPressiON goes here" AND
> source:(Some-Source)
> The first will fail since it is an open boolean query missing the last  
> part.
> Also, there is the problem with different writing styles, where  
> sometimes the query can be complex like
> <arg1> AND (source:Some-Source OR source:Some-Source OR ...) AND <arg2> 

> OR <arg3> ...
> etc..
> It really gives me an head-ache. Any ideas how i could solve this  
> problem in the best manner?
> All answers are highly appreciated!
> - Aleksander

Aleksander M. Stensby
Software Developer
Integrasco A/S
Tlf.: +47 41 22 82 72

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message