lucene-dev mailing list archives

From John Cwikla <>
Subject RE: Multiple languages in same index?
Date Wed, 29 Jan 2003 19:05:04 GMT

Be very careful with the multiple-index approach, especially if
you are trying to keep everything on the same machine in the
same process, since 10 languages means roughly 10x as many file
handles open in Lucene. You can easily get up to the tens of
thousands of open file handles if you have lots of fields.
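
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes each segment keeps a small fixed set of files plus one
extra file per indexed field, and that every index is held open by the same
process. The function name, the default of 7 fixed files, and the sample
figures (20 segments, 50 fields) are assumptions made up for illustration,
not measurements from a real index.

```python
def estimate_open_files(num_indexes, segments_per_index, indexed_fields,
                        fixed_files_per_segment=7):
    """Rough upper bound on files held open when every index is open at once.

    fixed_files_per_segment is an assumed constant for illustration only;
    check the file layout of your own index before relying on it.
    """
    # Assumption: one additional file per indexed field in each segment.
    files_per_segment = fixed_files_per_segment + indexed_fields
    return num_indexes * segments_per_index * files_per_segment


# One index vs. ten per-language indexes, same shape otherwise.
single = estimate_open_files(num_indexes=1, segments_per_index=20, indexed_fields=50)
ten = estimate_open_files(num_indexes=10, segments_per_index=20, indexed_fields=50)
print(single, ten)  # 1140 11400
```

The per-process limit on open files is usually far below that second
number by default, so it is worth checking before splitting by language.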


-----Original Message-----
From: Sale, Doug []
Sent: Wednesday, January 29, 2003 7:40 AM
To: 'Lucene Developers List'
Subject: RE: Multiple languages in same index?


you could use different analyzers over the same index, both when indexing
and when searching.  however, your search results will be bunk (that's bad):
terms produced by one analyzer generally won't match terms produced by another.
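
a toy illustration of why the results go bad, in plain Python rather than
lucene. the two "analyzers" below are hypothetical stand-ins (whitespace
splitting vs. overlapping character bigrams), not real lucene classes.  text
indexed with one and queried with the other shares no terms:

```python
def whitespace_analyzer(text):
    """Toy analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def bigram_analyzer(text):
    """Toy analyzer: overlapping two-character terms, whitespace ignored."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


# Terms stored in the index for one Chinese document.
indexed_terms = set(bigram_analyzer("搜索引擎"))

# Same analyzer at query time: every query term is found.
same = set(bigram_analyzer("搜索引擎")) & indexed_terms

# Different analyzer at query time: the single whole-string term is not indexed.
different = set(whitespace_analyzer("搜索引擎")) & indexed_terms

print(len(same), len(different))  # 3 0
```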

you would be better off maintaining separate indexes for each language.

it might be possible to use 1 index, provided a field was added to each
entry that defined the analyzer used on it.  you would then search first
over the entire index for entries whose analyzer matched the one you are
going to use on the input query (and then do your "regular" search over that
subset).  in short, it's a pain; better to do it in multiple indexes.
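
for what it's worth, the one-index idea can be sketched as a toy in-memory
model in python.  this is not lucene code: `ToyIndex`, the `ANALYZERS` table
and the language codes are all made-up names for illustration.  each entry
records which analyzer (language) was used on it, and a search first narrows
to entries with the matching language, then matches terms within that subset:

```python
def whitespace_analyzer(text):
    """Toy analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def bigram_analyzer(text):
    """Toy analyzer: overlapping two-character terms, whitespace ignored."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


# Hypothetical mapping from language code to the analyzer used for it.
ANALYZERS = {"en": whitespace_analyzer, "de": whitespace_analyzer, "zh": bigram_analyzer}


class ToyIndex:
    def __init__(self):
        self.docs = []

    def add(self, text, lang):
        # Store the language next to the terms so searches can filter on it.
        terms = set(ANALYZERS[lang](text))
        self.docs.append({"lang": lang, "terms": terms, "text": text})

    def search(self, query, lang):
        # Analyze the query with the analyzer for the requested language.
        query_terms = ANALYZERS[lang](query)
        # Step 1: keep only entries indexed with that same analyzer.
        subset = [d for d in self.docs if d["lang"] == lang]
        # Step 2: the "regular" search over that subset (all terms must match).
        return [d["text"] for d in subset
                if all(t in d["terms"] for t in query_terms)]


index = ToyIndex()
index.add("Lucene search engine", "en")
index.add("Suchmaschine für Lucene", "de")
index.add("搜索引擎", "zh")

print(index.search("lucene", "en"))  # ['Lucene search engine']
print(index.search("引擎", "zh"))     # ['搜索引擎']
```

note the german entry also contains the term "lucene" but is left out of the
english search because of the language filter.  that extra bookkeeping on
every add and every query is the pain mentioned above.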


> -----Original Message-----
> From: Randy Darling []
> Sent: Tuesday, January 28, 2003 4:18 PM
> To:
> Subject: Multiple languages in same index?
> Is it ok to index documents that have Chinese, German and English
> in the same index?  From what I can tell I just need to use a different
> analyzer when I create an IndexWriter.  But I do not see a way to
> search with an analyzer for a specific language.
> Or do I need to create a separate index for each language?
> Thanks,
> Randy
