lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Question about word treatment...
Date Sat, 05 May 2007 03:14:09 GMT
Hi,

Didn't see anyone answering your questions...

1) You'll need an analyzer and tokenizer that do the right thing for your input.  You may
not have to write your own, though: from what you've described so far, WhitespaceAnalyzer
(which splits on whitespace only) or something similar might be enough.
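
To illustrate point 1, here is a minimal sketch in plain Python (not Lucene's Java API) of
why whitespace-only splitting keeps ".Net" and "3D" intact. The function names are
hypothetical, and split_letters_and_digits is only a rough stand-in for a more aggressive
analyzer, not a reproduction of any particular Lucene or Solr component.

```python
import re


def whitespace_tokenize(text):
    """Split on runs of whitespace only; punctuation and digits stay attached."""
    return text.split()


def split_letters_and_digits(text):
    """Hypothetical aggressive tokenizer: drops punctuation and breaks
    tokens wherever letters and digits meet."""
    return re.findall(r"[A-Za-z]+|[0-9]+", text)


text = "Experience with .Net and 3D modeling"

print(whitespace_tokenize(text))
# ['Experience', 'with', '.Net', 'and', '3D', 'modeling']

print(split_letters_and_digits(text))
# ['Experience', 'with', 'Net', 'and', '3', 'D', 'modeling']
```

The trade-off is that whitespace-only splitting also leaves trailing punctuation attached
("Rails," stays "Rails,"), so it may need extra filtering depending on the input.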

2) Here you would have to write your own analyzer and tokenizer, one that keeps a sliding
window of the last N tokens and looks them up in your phrase (synonym-style) table.  When it
finds the window's phrase in the lookup table, it returns those N tokens as a single token.
Something like that....
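
The idea in point 2 can be sketched as follows, again in plain Python rather than as a real
Lucene TokenFilter. The function merge_phrases and its arguments are hypothetical; the
sketch assumes case-insensitive matching, prefers the longest phrase, and looks ahead from
the current token instead of back over the last N tokens, which gives the same result when
the whole token list is available.

```python
def merge_phrases(tokens, phrases):
    """Merge any run of tokens that matches a known phrase into one token.

    tokens  -- list of strings, already split (e.g. on whitespace)
    phrases -- iterable of multiword phrases, e.g. ["Ruby on Rails"]
    """
    # Lookup table keyed by lowercased token tuples.
    table = {tuple(p.lower().split()) for p in phrases}
    # Window size N is the length of the longest known phrase.
    max_len = max((len(p) for p in table), default=1)

    out = []
    i = 0
    while i < len(tokens):
        # Try the widest window first so longer phrases win.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            window = tuple(t.lower() for t in tokens[i:i + size])
            if window in table:
                out.append(" ".join(tokens[i:i + size]))
                i += size
                break
        else:
            # No phrase starts here; emit the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


tokens = "I like Ruby on Rails a lot".split()
print(merge_phrases(tokens, ["Ruby on Rails"]))
# ['I', 'like', 'Ruby on Rails', 'a', 'lot']
```

The same table has to be applied at both index time and query time, otherwise a query for
the phrase will not produce the merged token that was indexed.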

Otis

--
Lucene Consulting -- http://lucene-consulting.com/


----- Original Message ----
From: escher2k <escher2k@yahoo.com>
To: solr-user@lucene.apache.org
Sent: Friday, May 4, 2007 4:08:03 PM
Subject: Question about word treatment...


(1) How does one ensure that Solr treats words like .Net and 3D correctly? Right now, they
get translated into Net and 3 d respectively.

(2) Is it possible to force Lucene to treat a multiword (e.g. Ruby on Rails) as one word?
I am not sure if there is a mechanism to do this by creating a special text file (like the
one that exists for synonyms, for instance).

Thanks.
-- 
View this message in context: http://www.nabble.com/Question-about-word-treatment...-tf3693913.html#a10329261
Sent from the Solr - User mailing list archive at Nabble.com.




