lucene-solr-user mailing list archives

From Walter Underwood <>
Subject Re: token concat filter?
Date Thu, 01 May 2008 18:20:24 GMT
I doubt it would be that many. I recommend tracking the searches and
the clicks, and working on queries with low clickthrough.
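
A minimal sketch of that kind of analysis in Python, assuming the search
log can be reduced to (query, clicked) pairs. The function name, the
thresholds, and the log shape are all hypothetical, not something from
Solr itself:

```python
from collections import defaultdict

def low_clickthrough(events, min_searches=2, max_ctr=0.2):
    """Return (query, searches, clickthrough_rate) for poorly performing queries.

    events: iterable of (query, clicked) pairs, where clicked is True
    when the search led to at least one click. This log format is an
    assumption made for the sketch.
    """
    searches = defaultdict(int)
    clicks = defaultdict(int)
    for query, clicked in events:
        q = query.strip().lower()  # fold trivial variants together
        searches[q] += 1
        if clicked:
            clicks[q] += 1

    rows = []
    for q, n in searches.items():
        ctr = clicks[q] / n
        if n >= min_searches and ctr <= max_ctr:
            rows.append((q, n, ctr))

    # Most frequently searched problem queries first.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows
```

Queries that are searched often but rarely clicked float to the top,
and those are the ones worth checking for a missing synonym.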

Here are a few of mine from that sort of analysis:

ghost dog => ghost dog, ghostdog
ghost hunters => ghost hunters, ghosthunters
ghost rider => ghost rider, ghostrider
ghost world => ghost world, ghostworld
ghostbusters => ghostbusters, ghost busters
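
Rules of this shape are easy to generate once you have the list of
phrases. A small sketch, with hypothetical helper names, that emits
lines in the same "left => right, right" form as the ones above:

```python
def biword_rule(phrase):
    """Build a rule mapping a two-word phrase to itself plus its glued form."""
    words = phrase.split()
    if len(words) != 2:
        raise ValueError("expected exactly two words: %r" % phrase)
    spaced = " ".join(words)
    joined = "".join(words)
    return f"{spaced} => {spaced}, {joined}"

def split_rule(word, split_at):
    """Build the reverse rule: a glued word mapped to itself plus a split form.

    split_at is the index where the space goes. It has to be chosen by
    hand or from a dictionary; nothing here guesses it.
    """
    if not 0 < split_at < len(word):
        raise ValueError("split_at must fall inside the word")
    spaced = word[:split_at] + " " + word[split_at:]
    return f"{word} => {word}, {spaced}"

if __name__ == "__main__":
    for phrase in ["ghost dog", "ghost hunters", "ghost rider", "ghost world"]:
        print(biword_rule(phrase))
    print(split_rule("ghostbusters", 5))
```

This only formats the rules. Deciding which phrases deserve one still
comes from the clickthrough analysis.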

I don't see as many in personal names, mostly things like "De Niro"
and "DiCaprio".


On 5/1/08 11:13 AM, "Geoffrey Young" <> wrote:
> Walter Underwood wrote:
>> I've been doing it with synonyms and I have several hundred of them.
> I'm dealing mostly with proper names, so I expect more like 80k of them
> for our data :)
>> Concatenating bi-word groups is pretty useful for English. We have a
>> habit of gluing words together. "database" used to be two words.
>> Dictionaries still think it should be "web server".
> :)
> --Geoff
