lucene-solr-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: analysis tool vs. reality
Date Mon, 16 Aug 2010 20:37:24 GMT
On Mon, Aug 16, 2010 at 4:20 PM, Chris Hostetter
<hossman_lucene@fucit.org> wrote:

>
> Even if you convince folks to make every change you think should be made
> to the Lucene QueryParser (again: please take that up in a separate
> thread) it won't change the fact that people using analysis.jsp should
> understand the distinction between Query Parsing and Analysis -- unless
> you plan on getting rid of every metacharacter that the Lucene QueryParser
> uses to decide what types of Query to build (ie: '"', '-', '"', '*', '?')
> and unless you plan on forcing Solr users to only ever use that one
> QueryParser, then no matter what the Lucene QueryParser does with
> whitespace, users still need to understand the distinction between Query
> Parsing and Analysis so they don't type 'Foo*' into analysis.jsp and then
> ask why it says that will match "food" but it doesn't actually match at
> query time. (surprise, surprise: Query Parsing is not the same as analysis,
> and when the QueryParser sees wildcards it doesn't use the analyzer)
>
>
Maybe for once your argument isn't completely bogus: the surprise is
actually key here. There's really nothing documenting the various
hacks/limitations in the queryparsers, such as auto-tokenizing on
whitespace.

I think the 'expanded terms' not being analyzed is similar: it's not really
documented well. That's probably why it seems to come up on the mailing list
at least every week [at this point you have to admit there is a problem].
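
To make the gap concrete, here is a toy sketch in Python. It is not Lucene or
Solr code: `analyze`, `build_index`, and `search` are made-up names, and the
"analyzer" only lowercases. It assumes a parser that splits the query on
whitespace before analysis and passes wildcard clauses through unanalyzed,
which is the behavior described in the quoted message.

```python
import fnmatch


def analyze(text):
    """Toy analyzer (assumption): split on whitespace, lowercase each token."""
    return [token.lower() for token in text.split()]


def build_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Toy query parser (assumption): whitespace-separated clauses, OR'ed.

    Clauses containing '*' or '?' skip the analyzer and are matched against
    the indexed terms exactly as typed. Other clauses are analyzed first.
    """
    hits = set()
    for clause in query.split():
        if "*" in clause or "?" in clause:
            # Wildcard clause: no analysis, so no lowercasing happens here.
            for term, doc_ids in index.items():
                if fnmatch.fnmatchcase(term, clause):
                    hits |= doc_ids
        else:
            for term in analyze(clause):
                hits |= index.get(term, set())
    return hits


index = build_index({1: "Food"})

# What an analysis-only view would show for the input 'Foo*':
analysis_view = analyze("Foo*")

# What the toy parser actually does with the same input:
parsed_hits = search(index, "Foo*")
lowercase_hits = search(index, "foo*")
```

In this sketch the analysis-only view turns 'Foo*' into a lowercased token
that looks like it should line up with the indexed term, while the parser
path never lowercases the wildcard clause and finds nothing. Typing 'foo*'
by hand does match.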

If you want to say the analysis tool is agnostic about queryparsers, that's
fine, you can keep saying that. I'm saying it shouldn't be.


-- 
Robert Muir
rcmuir@gmail.com
