lucene-solr-user mailing list archives

From "Jaeger, Jay - DOT" <Jay.Jae...@dot.wi.gov>
Subject RE: Synonyms Not Working when using SRC & DEST
Date Wed, 07 Sep 2011 12:48:53 GMT
> I have a very huge schema spanning up to 10K lines , if I use query time it
> will be huge hit for me because one term will be mapped to multiple terms .
> similar in the case of allergy

I think maybe you mean the synonym file, rather than the schema?  I doubt that the number of
lines matters all that much, though undoubtedly it matters some.  I expect that Solr loads the
synonym file into some kind of hash map, rather than searching it linearly -- though I have
not looked at the code for that.

> I replace allergy during the index with doctors , So it shouldn't be part of
> the document ?

Yes indeed, doctors would be in the index, and would give you a hit on that document when
searched.  But because your synonym file specifies replacement, that means that allergy is
*NOT* part of the index, hence, when you searched on allergy, you got no results.
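
A toy sketch of that behavior, in Python rather than anything Solr-specific.  It is not
Solr's actual analysis chain: the SYNONYMS mapping, the whitespace tokenizer and the
function names are all assumptions made for illustration.  It only shows that a term
replaced at index time cannot be found unless the query goes through the same mapping.

```python
# Toy model of index-time synonym replacement; NOT Solr's implementation.
# Hypothetical rule, in the spirit of a replacement mapping "allergy => doctors".
SYNONYMS = {"allergy": ["doctors"]}

def analyze(text, synonyms):
    """Lowercase, split on whitespace, and replace any mapped token."""
    tokens = []
    for token in text.lower().split():
        tokens.extend(synonyms.get(token, [token]))
    return tokens

def build_index(docs, synonyms):
    """Map each indexed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text, synonyms):
            index.setdefault(token, set()).add(doc_id)
    return index

def search(index, query, synonyms=None):
    """Return ids of documents matching any analyzed query token."""
    hits = set()
    for token in analyze(query, synonyms or {}):
        hits |= index.get(token, set())
    return hits

docs = {1: "allergy clinic"}
index = build_index(docs, SYNONYMS)  # "allergy" is replaced, so it is never indexed

no_query_synonyms = search(index, "allergy")              # query analyzed without synonyms
with_query_synonyms = search(index, "allergy", SYNONYMS)  # same mapping applied to the query
doctors_hits = search(index, "doctors")
```

In this sketch the first search comes back empty, while the other two find document 1.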

As for synonym expansion being a "huge hit": no, not really, I think.  Besides, if you
are not getting what you want or need, speed becomes pretty much irrelevant.  We did some
performance testing:  modest single server (i.e., a laptop running Windows XP with only 2GB
total memory available), pretty much configured "out of the box" with Jetty, except that we
added Waffle authentication.  The data was names, addresses and the like (not text) -- 7+
million rows, with considerable synonym expansion:  200 first name synonyms, 433 last name
synonyms, expanded at both index time and search time.

We then did a search test driven from those same synonym files, by randomly picking out a
name from the first and last name lists, the idea being that the names searched would most
likely have some synonyms.

Under Solr 3.1, once the OS file system cache got some entries in there, running with 8 concurrent
client search threads sending HTTP search requests (done in Perl) we averaged about 0.50 seconds
per request, or over 55,000 searches per hour.
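
As a rough check on how those two figures relate, here is the arithmetic as a short Python
sketch.  It assumes each of the 8 client threads sends its next request as soon as the
previous one returns, which the test description does not state explicitly; the variable
names are illustrative only.

```python
# Back-of-the-envelope throughput from the figures quoted above.
threads = 8            # concurrent client search threads
avg_latency_s = 0.50   # average seconds per request

# Assumption: every thread always has exactly one request in flight.
requests_per_second = threads / avg_latency_s
searches_per_hour = requests_per_second * 3600

print(f"{requests_per_second:.0f} requests/s, {searches_per_hour:,.0f} searches/hour")
```

Under that assumption the ceiling is 57,600 searches per hour, which is consistent with the
"over 55,000" figure.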

JRJ

-----Original Message-----
From: balaji [mailto:mcabalaji@gmail.com] 
Sent: Tuesday, September 06, 2011 7:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms Not Working when using SRC & DEST

> It won't work given your current schema.  To get the desired results, you
> would need to expand your synonyms at both index AND query time.  Right now
> your schema seems to specify it only at index time.
>

I have a very huge schema spanning up to 10K lines , if I use query time it
will be huge hit for me because one term will be mapped to multiple terms .
similar in the case of allergy

I don't want to go with comma-separated synonyms, as that will give
some erroneous results; moreover, allergy and doctors are not equivalent
terms to be listed together with commas


>
> So, as the other respondent indicated, currently you replace allergy with
> the other list when indexing, and since allergy is not replaced during
> query, it gets no hits.
>

I replace allergy during the index with doctors , So it shouldn't be part of
the document ?


Thanks
Balaji


--
View this message in context: http://lucene.472066.n3.nabble.com/Synonyms-Not-Working-when-using-SRC-DEST-tp3313862p3315287.html
Sent from the Solr - User mailing list archive at Nabble.com.
