lucene-solr-user mailing list archives

From "Nick Jenkin" <njen...@gmail.com>
Subject Re: Apostrophes in fields
Date Wed, 17 Jan 2007 00:16:07 GMT
Using fuzzy searching fixed the problem - I will have a play with
the analyzers and see if I can get it working nicely.

Thanks again, much appreciated.

On 1/17/07, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
> : This problem is why some sloppiness is recommended when dealing with
> : WordDelimiterFilter.
>
> particularly when using the generate___Parts="true" options
>
> Nick: if you want simpler matching like this, you might want to consider
> simplifying your definition of "text" ... if you look at the "textTight"
> fieldtype in the example schema (used by the field "sku") you'll see a
> simpler usage of WordDelimiterFilter ... alternately you may just want to
> use Lucene's basic StandardAnalyzer ... i believe it strips apostrophes.
>
> as a real last resort, you could use the recently added
> PatternReplaceFilter to strip out apostrophes prior to
> WordDelimiterFilter (if you like everything WordDelim does for you except
> splitting on apostrophes)
>
> :   - optionally index ohara at *both* "o" and "hara"
>
> then searching for "Shelley ohara memorial" fails unless you have
> slop .. if you need slop, you might as well not index it twice (not to
> mention it throws off the tf/idf calculations)
>
> :   - pick the "alignment" based on the token position in the stream...
> : right-justify the catenations if it's the first token, otherwise
> : left-justify.  One could try to identify proper names and do the
> : justification correctly too (blech).
>
> oh for the love of god please no.
>
>
>
> -Hoss
>
>
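
To make the "last resort" suggestion quoted above concrete, here is a toy
sketch of why stripping apostrophes before the delimiter step changes the
tokens. It is plain Python working on a raw string, not Solr's
PatternReplaceFilter or WordDelimiterFilter (which operate on a token stream
and do more than this); the function names strip_apostrophes and
split_on_delimiters are made up for illustration.

```python
import re


def strip_apostrophes(text):
    """Remove straight and curly apostrophes (stand-in for a pattern-replace step)."""
    return re.sub(r"['\u2019]", "", text)


def split_on_delimiters(text):
    """Toy stand-in for a word-delimiter step.

    Lowercases the text, then splits on every run of non-alphanumeric
    characters. The real filter has many more rules; this only models
    splitting on punctuation such as the apostrophe.
    """
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


name = "Shelley O'Hara Memorial"

# Apostrophe treated as a delimiter: "O'Hara" becomes two tokens.
print(split_on_delimiters(name))

# Apostrophe removed first: "O'Hara" stays together as one token.
print(split_on_delimiters(strip_apostrophes(name)))
```

With the apostrophe left in, the name splits into "o" and "hara", so a query
for "ohara" has nothing to match exactly. With the apostrophe stripped first,
"ohara" is indexed as a single token and a plain phrase query can match
without slop.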


-- 
- Nick
