lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: dealing with dash chars in fields when using dismax
Date Sun, 13 Jun 2010 15:20:44 GMT
<<<sure .. escaping ends up being the same as removing>>>
I don't think so. Removing the hyphens would mean that the same exact-match
search matches documents both with and without them, i.e. searching for
"my - way" would match original content of either "my way" or "my - way".
Escaping the hyphen, by contrast, would cause only the correct exact match
to be returned. This may or may not be the desired behavior...
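
To make the difference concrete, here is a minimal sketch of the two
client-side transformations being compared. The function names are
hypothetical, and whether the two resulting queries actually match different
documents depends on the analysis chain of the fields being searched (a
tokenizer that discards punctuation would make them behave the same).

```python
def escape_hyphens(query: str) -> str:
    """Backslash-escape each hyphen so the query parser reads it as a literal."""
    return query.replace("-", r"\-")


def remove_hyphens(query: str) -> str:
    """Drop hyphens entirely and collapse the whitespace they leave behind."""
    return " ".join(query.replace("-", " ").split())


if __name__ == "__main__":
    print(escape_hyphens("my - way"))  # the hyphen survives, escaped
    print(remove_hyphens("my - way"))  # the hyphen is gone
```

With escaping, the hyphen is still part of what gets sent to the parser; with
removal, that information is lost before the query is ever parsed.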

<<<but still is there some clean solution that doesnt mean a lot of coding
work on my end to handle dash both as a special and as a normal char.>>>

And how would the code know? You're essentially asking for DWIM (Do What I
Mean) functionality, which I've been awaiting for many years....

It seems a reasonable approach would be to have your power users understand
that they need to escape hyphens. Or introduce your own syntax for negation,
which would be a simple string substitution on the way through. Or...
Either way, somewhere you need some external input that distinguishes
between "I mean this hyphen to be a negation" and "this other one is a
literal".
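
A rough sketch of the "own syntax for negation" idea, assuming (purely for
illustration) that users type a leading "!" to negate a term and that every
hyphen they type is meant literally. The "!" marker and the function name are
my invention, not anything Solr or dismax defines.

```python
import re


def translate_query(user_query: str) -> str:
    """Rewrite a user query before handing it to the query parser.

    Assumed convention (hypothetical): "!" at the start of a term means
    negation, and any "-" the user typed is a literal character.
    """
    # Step 1: every hyphen in the raw input is literal, so escape it.
    escaped = user_query.replace("-", r"\-")
    # Step 2: turn a term-initial "!" into the parser's "-" negation operator.
    return re.sub(r"(^|\s)!(?=\S)", r"\1-", escaped)


if __name__ == "__main__":
    print(translate_query("wi-fi !router"))
```

The order matters: escaping first means the only unescaped hyphens left in
the final string are the ones produced from the negation marker.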

If this seems irrelevant, then I'm missing your point pretty badly. A use
case or two where this distinction is important would be helpful. Or is that
use-case <G>?

Best
Erick

On Sun, Jun 13, 2010 at 11:00 AM, Lukas Kahwe Smith <mls@pooteeweet.org> wrote:

>
> On 13.06.2010, at 16:57, Erick Erickson wrote:
>
> > Have you tried escaping the dashes? Your dismax definition
> > and the output from the analysis admin page would also help.
>
>
> sure .. escaping ends up being the same as removing. but i guess it would
> be the better approach of course. but still is there some clean solution
> that doesnt mean a lot of coding work on my end to handle dash both as a
> special and as a normal char.
>
> something like doing the search twice both with the dash escaped and not
> escaped and then some intelligent scoring to produce the final result set.
>
> regards,
> Lukas Kahwe Smith
> mls@pooteeweet.org
>
>
>
>
