lucene-java-user mailing list archives

From balasubramanian sudaakeran <sudaakera...@yahoo.com>
Subject Re: Index and search terms containing character "-"
Date Sun, 31 May 2009 10:57:14 GMT

Hi Tom,
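You are using SimpleAnalyzer while indexing, which transforms your string before it is indexed:
it splits the text at non-letter characters and lowercases it, so "jack-bauer" is indexed as the
two terms "jack" and "bauer". A TermQuery is not analyzed, so the term "jack-bauer" is looked up
as is and is not found. If you use an analyzer that transforms words during indexing, you should
use the same or a similar analyzer when querying as well.

Try the same with KeywordAnalyzer instead of SimpleAnalyzer. It will work because KeywordAnalyzer
indexes the whole value as a single keyword without removing any characters, so the term will match.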
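
To make the mismatch concrete, here is a minimal sketch in plain Java that only imitates the two
tokenization behaviours. It does not call Lucene: the class AnalyzerSketch and the methods
simpleTokens, keywordTokens and termMatches are hypothetical names made up for illustration, and
the splitting rule (break at non-letters, lowercase) is an approximation of what SimpleAnalyzer
does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical helpers that mimic analyzer behaviour, not Lucene classes.
public class AnalyzerSketch {

    // Approximates SimpleAnalyzer: break the text at every non-letter, lowercase each piece.
    static List<String> simpleTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Approximates KeywordAnalyzer: the whole value becomes one unchanged token.
    static List<String> keywordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }

    // A term lookup is an exact comparison against the indexed terms; the query is not analyzed.
    static boolean termMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        String caption = "jack-bauer";
        System.out.println(simpleTokens(caption));                              // [jack, bauer]
        System.out.println(termMatches(simpleTokens(caption), "jack-bauer"));   // false
        System.out.println(termMatches(simpleTokens(caption), "jack"));         // true
        System.out.println(termMatches(keywordTokens(caption), "jack-bauer"));  // true
    }
}
```

Note that with whole-value indexing the term "jack" no longer matches the "jack-bauer" caption,
so a search for "jack" would be expected to return one caption instead of two.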

regards,
Sudaakeran B



----- Original Message ----
From: legrand thomas <thomaslegrand14@yahoo.fr>
To: java-user@lucene.apache.org
Sent: Sunday, May 31, 2009 3:25:20 PM
Subject: Index and search terms containing character "-"

Hi,

I have a problem using TermQuery and FuzzyQuery for terms containing the character "-". Considering
I've indexed "jack" and "jack-bauer" as 2 tokenized captions, I get no result when searching
for "jack-bauer". Moreover, "jack" with a TermQuery returns the two captions.
 
What should I do to get "jack-bauer" with new TermQuery("jack-bauer") ?

A full test case is given below.

Thanks,
Tom


import junit.framework.Assert;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;

public class IDebugIndexTest {

    @Test
    public void TermQueryTest() {

        Analyzer analyser = new SimpleAnalyzer();

        try {
            // write docs to new index
            IndexWriter writer = new IndexWriter(FSDirectory
                    .getDirectory("/tmp/idx_test"), analyser, true);

            Document jack = new Document();
            jack.add(new Field("caption", "jack", Field.Store.YES,
                    Field.Index.TOKENIZED));
            writer.addDocument(jack);

            Document jackBauer = new Document();
            jackBauer.add(new Field("caption", "jack-bauer", Field.Store.YES,
                    Field.Index.TOKENIZED));
            writer.addDocument(jackBauer);

            writer.close();

            // try to search
            IndexSearcher s = new IndexSearcher(IndexReader.open(FSDirectory
                    .getDirectory("/tmp/idx_test")));

            // The next assertion is ok
            Hits jackHits = s.search(new TermQuery(new Term("caption", "jack")));
            Assert.assertEquals(2, jackHits.length());

            // The next assertion fails !!!
            Hits jackBauerHits = s.search(new TermQuery(new Term("caption",
                    "jack-bauer")));
            Assert.assertEquals(1, jackBauerHits.length());

        } catch (Exception e) {
            Assert.fail();
        }

    }
}


      


