lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Multiple passes with WordDelimiterFilterFactory
Date Fri, 27 Aug 2010 17:00:09 GMT
I agree with Markus: the usefulness of passing through WDF twice
is suspect. You can always use a copyField to send the data to a
completely different field and do whatever you want there. A copyField
forks the raw input to the second field, not the analyzed stream.
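
A minimal schema.xml sketch of that copyField approach follows. The
field names (title, title_exact) and the field type names (text_wdf,
text_wdf_preserve) are hypothetical and would need matching fieldType
definitions in your own schema; only the copyField mechanism itself is
the point here.

```xml
<!-- Hypothetical field and type names; adjust to your own schema. -->
<field name="title"       type="text_wdf"          indexed="true" stored="true"/>
<field name="title_exact" type="text_wdf_preserve" indexed="true" stored="false"/>

<!-- Copies the raw, unanalyzed input of "title" into "title_exact",
     where it is analyzed independently by that field's own type. -->
<copyField source="title" dest="title_exact"/>
```

Each field then runs its own analysis chain, so one can use
preserveOriginal and the other not, without chaining two WDF passes.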

What is it you're really trying to accomplish? Your use-case would
help us help you.

About defining the analysis differently at index time and query time:
sure, it can make sense. But it's tricky, especially with WDF. Spend
some significant time in the admin analysis page looking at the effects
of various configurations before you decide.
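
As a rough sketch only, separate index-time and query-time chains could
look like the following. The type name text_wdf, the whitespace
tokenizer, and the lowercase filter are assumptions for illustration,
not your actual configuration; the specific choice of catenating at
index time but not at query time is one common pattern, not a
recommendation for your data.

```xml
<!-- Hypothetical field type; verify behavior on the admin analysis page. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: emit parts, catenated forms, and the original token. -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query time: emit only the parts, with no catenation. -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```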

Best
Erick

On Fri, Aug 27, 2010 at 4:26 AM, Markus Jelsma <markus.jelsma@buyways.nl> wrote:

> It's just a configured filter, so you should be able to define it twice.
> Have you tried it? It might be tricky, though: the output from the first
> will be the input of the second, so I doubt the usefulness of this approach.
>
>
> On Thursday 26 August 2010 17:45:45 Shawn Heisey wrote:
> >   Can I pass my data through WordDelimiterFilterFactory more than once?
> > It occurs to me that I might get better results if I can do some of the
> > filters separately and use preserveOriginal on some of them but not
> > others.
> >
> > Currently I am using the following definition on both indexing and
> > querying.  Would it make sense to do the two differently?
> >
> > <filter class="solr.WordDelimiterFilterFactory"
> >    splitOnCaseChange="1"
> >    splitOnNumerics="1"
> >    stemEnglishPossessive="1"
> >    generateWordParts="1"
> >    generateNumberParts="1"
> >    catenateWords="1"
> >    catenateNumbers="1"
> >    catenateAll="0"
> >    preserveOriginal="1"
> > />
> >
> > Thanks,
> > Shawn
> >
>
> Markus Jelsma - Technisch Architect - Buyways BV
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
>
>
