lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Maximum number of fields allowed in a Solr document
Date Tue, 01 Dec 2009 21:47:45 GMT
Lucene creates an array with one entry per document for every field you
sort on. If you sort on a thousand fields, Lucene will create 1000
different arrays of 500K ints each. I assume these arrays are cached in
some way. In Solr, it is also possible to sort by using a function as
the relevance value. This is rather slow, and no data is cached between
queries.
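
To put rough numbers on this, here is a back-of-the-envelope sketch in
Python. It assumes 4 bytes per int, that every one of the 1000 fields
gets sorted on at least once, and that all of the arrays stay in memory
at the same time; the function name is made up for illustration, and
real usage will differ with per-array overhead and cache eviction.

```python
# Rough estimate of memory held by per-field sort arrays.
# Assumption: one 4-byte int per document per sorted field, all arrays resident.
BYTES_PER_INT = 4


def sort_array_bytes(num_docs, num_sort_fields, bytes_per_value=BYTES_PER_INT):
    """Bytes needed for one array of num_docs values per sorted field."""
    return num_docs * num_sort_fields * bytes_per_value


# Figures from this thread: up to 500K documents per core, 5 cores,
# and sorting on a thousand different fields.
per_core = sort_array_bytes(num_docs=500_000, num_sort_fields=1000)
all_cores = per_core * 5

print(f"per core:  {per_core / 1e9:.1f} GB")
print(f"all cores: {all_cores / 1e9:.1f} GB")
```

Under those assumptions that is about 2 GB per core and 10 GB across
all five cores, before counting anything else the JVM needs.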

You may want to do sorting in your front-end applications, or get
database ids from Solr and do sorting in the database query.
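
A minimal sketch of the front-end option, in Python. It assumes the
documents have already been fetched from Solr as plain dicts and that
the whole result set fits in the client; the field name "rank_17_i" and
the helper sort_docs are hypothetical, not part of any Solr client API.

```python
def sort_docs(docs, field, descending=False):
    """Sort already-fetched documents by an integer field on the client.

    Documents that lack the field are placed last, in their original order.
    """
    present = [d for d in docs if field in d]
    missing = [d for d in docs if field not in d]
    present.sort(key=lambda d: d[field], reverse=descending)
    return present + missing


# Hypothetical documents as they might come back from a query.
docs = [
    {"id": "a", "rank_17_i": 30},
    {"id": "b"},
    {"id": "c", "rank_17_i": 5},
]

ordered = sort_docs(docs, "rank_17_i")
print([d["id"] for d in ordered])
```

This keeps the per-field arrays out of the Solr heap entirely, at the
cost of pulling every matching document to the client before sorting.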

On Mon, Nov 30, 2009 at 7:14 AM, Alex Wang <awang@crossview.com> wrote:
> Thanks Otis for the reply. Yes this will be pretty memory intensive.
> The index consists of 5 cores with a maximum of 500K documents per
> core. I did search the archives before but did not find any definite
> answer. Thanks again!
>
> Alex
>
>
>
> On Nov 27, 2009, at 11:09 PM, Otis Gospodnetic wrote:
>
>> Hi Alex,
>>
>> There is no built-in limit.  The limit is going to be dictated by
>> your hardware resources.  In particular, this sounds like a memory
>> intensive app because of sorting on lots of different fields.  You
>> didn't mention the size of your index, but that's a factor, too.
>> Once in a while people on the list mention cases with lots and lots
>> of fields, so I'd check ML archives.
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>>
>> ----- Original Message ----
>>> From: Alex Wang <awang@crossview.com>
>>> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>>> Sent: Thu, November 26, 2009 12:47:36 PM
>>> Subject: Maximum number of fields allowed in a Solr document
>>>
>>> Hi,
>>>
>>> We are in the process of designing a Solr app where we might have
>>> millions of documents and, within each document, we might have
>>> thousands of dynamic fields. These fields are small and only contain
>>> an integer, which needs to be retrievable and sortable.
>>>
>>> My questions are:
>>>
>>> 1. Is there a limit on the number of fields allowed per document?
>>> 2. What is the performance impact for such design?
>>> 3. Has anyone done this before and is it a wise thing to do?
>>>
>>> Thanks,
>>>
>>> Alex
>>
>
>



-- 
Lance Norskog
goksron@gmail.com
