lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Injecting synonyms into Solr
Date Fri, 01 May 2015 03:08:54 GMT
Thank you for the info. Yup, this works. I found out that we can't load
files larger than 1MB into ZooKeeper; this applies to any file over 1MB,
not just the synonyms files.
But I'm not sure whether there will be an impact on the system, as the
number of synonym text files could grow to more than 20, since my sample
synonym file is more than 20MB.
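
A rough sketch of how the splitting might be scripted, in case it helps.
It assumes one synonym rule per line and a UTF-8 file. The function names,
the "syn" file prefix, and the 1,000,000-byte ceiling (a little under the
1MB limit) are choices made for illustration, not anything defined by Solr
or ZooKeeper.

```python
from pathlib import Path

# Assumption: stay a little under the 1MB limit to leave some headroom.
MAX_BYTES = 1_000_000


def split_synonyms(lines, max_bytes=MAX_BYTES):
    """Group whole lines into chunks whose UTF-8 size stays within max_bytes."""
    chunks, current, size = [], [], 0
    for line in lines:
        if not line.endswith("\n"):
            line += "\n"
        n = len(line.encode("utf-8"))
        if n > max_bytes:
            raise ValueError("a single line exceeds the size limit")
        # Start a new chunk rather than cutting a synonym rule in half.
        if current and size + n > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        chunks.append(current)
    return chunks


def write_chunks(chunks, out_dir, prefix="syn"):
    """Write each chunk to <prefix>N.txt and return the file names in order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for i, chunk in enumerate(chunks, start=1):
        name = f"{prefix}{i}.txt"
        (out / name).write_text("".join(chunk), encoding="utf-8")
        names.append(name)
    return names


def synonyms_attribute(names):
    """Build the comma-separated value for the synonyms attribute."""
    return ",".join(names)
```

Reading the big file, passing its lines to split_synonyms, and handing the
result to write_chunks gives a list of file names; synonyms_attribute then
joins them into the value for synonyms="..." on the SynonymFilterFactory.
Splitting only on line boundaries keeps each rule intact in a single file.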

Currently I have fewer than 500,000 records indexed in Solr, so I'm not
sure whether the impact will be significant compared to an index with
millions of records.
I will try to get more records indexed and will update here again.

Regards,
Edwin


On 1 May 2015 at 08:17, Philippe Soares <soares@genomequest.com> wrote:

> Split your synonyms into multiple files and set the SynonymFilterFactory
> with a comma-separated list of files, e.g.:
> synonyms="syn1.txt,syn2.txt,syn3.txt"
>
> On Thu, Apr 30, 2015 at 8:07 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>
> > Just to populate it with general synonym words. I've managed to
> > populate it from some online sources, but is there a limit to what it can
> > contain?
> >
> > I can't load the configuration into zookeeper if the synonyms.txt file
> > contains more than 2100 lines.
> >
> > Regards,
> > Edwin
> > On 1 May 2015 05:44, "Chris Hostetter" <hossman_lucene@fucit.org> wrote:
> >
> > >
> > > : There is a possible solution here:
> > > : https://issues.apache.org/jira/browse/LUCENE-2347 (Dump WordNet to SOLR
> > > : Synonym format).
> > >
> > > If you have WordNet synonyms you don't need any special code/tools to
> > > convert them -- the current solr.SynonymFilterFactory supports wordnet
> > > files (just specify format="wordnet")
> > >
> > >
> > > : > > Does anyone know a faster method of populating the synonyms.txt file
> > > : > > than manually typing the words into the file, given that there could be
> > > : > > thousands of synonyms around?
> > >
> > > populate from what?  what is the source of your data?
> > >
> > > the default solr synonym file format is about as simple as it could
> > > possibly be -- pretty trivial to generate it from scripts -- the hard part
> > > is usually selecting the synonym data you want to use and parsing whatever
> > > format it is already in.
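
To illustrate the point about generating the file from a script, here is a
minimal sketch. It assumes the source data has already been parsed into
groups of equivalent terms and writes one comma-separated line per group,
which is the simple "equivalent synonyms" form of the default format. The
function name and the sample groups are hypothetical, and terms containing
a comma or "=>" are rejected rather than escaped.

```python
def format_synonym_lines(groups):
    """Turn groups of equivalent terms into comma-separated synonym lines."""
    lines = []
    for group in groups:
        seen, terms = set(), []
        for term in group:
            term = term.strip()
            if "," in term or "=>" in term:
                # Escaping is out of scope for this sketch, so fail loudly.
                raise ValueError(f"term needs escaping, not handled here: {term!r}")
            key = term.lower()
            # Drop blanks and case-insensitive duplicates within a group.
            if term and key not in seen:
                seen.add(key)
                terms.append(term)
        # A group with fewer than two terms carries no synonym information.
        if len(terms) >= 2:
            lines.append(", ".join(terms))
    return lines


# Hypothetical sample data, standing in for whatever source is parsed.
groups = [["couch", "sofa", "divan"], ["tv", "television", "TV"], ["lonely"]]
print("\n".join(format_synonym_lines(groups)))
```

The returned lines can be written straight to a synonyms file; the parsing
of the original source into groups is the part that varies per data set.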
> > >
> > >
> > >
> > > -Hoss
> > > http://www.lucidworks.com/
> > >
> >
>
