lucene-solr-user mailing list archives

From Matteo Fiandesio <matteo.fiande...@gmail.com>
Subject Re: OOM on sorting on dynamic fields
Date Tue, 22 Jun 2010 08:21:07 GMT
First of all, thanks for your answers.
Those OOMEs are pretty nasty for our production environment.
I didn't try sorting by function, as it is a Solr 1.5
feature and we prefer to stay on the stable 1.4 release.

I made a temporary patch that appears to be working fine.
I patched the lucene-core-2.9.1 source code, adding these lines to the
get method of the abstract static class Cache:
...
public Object get(IndexReader reader, Entry key) throws IOException {
      Map innerCache;
      Object value;
+      final Object readerKey = reader.getFieldCacheKey();
+      CacheEntry[] cacheEntries = wrapper.getCacheEntries();
+      if (cacheEntries.length > A_TUNED_INT_VALUE) {
+          readerCache.clear();
+      }
...

So far I haven't noticed any slowdown or concurrency problems.
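
For comparison, a less drastic alternative to clearing the whole cache would be to evict only the least recently used entries once a size limit is reached. Below is a minimal sketch in plain Java. The class name BoundedLruCache and the maxEntries parameter are hypothetical, chosen only for illustration; this is not Lucene's FieldCache API and it has not been wired into FieldCacheImpl.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Assumed tuning knob, playing the role of A_TUNED_INT_VALUE in the patch.
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true drops the least recently used entry.
        return size() > maxEntries;
    }

    // LinkedHashMap is not thread-safe, so hand out a synchronized view.
    static <K, V> Map<K, V> create(int maxEntries) {
        return Collections.synchronizedMap(new BoundedLruCache<K, V>(maxEntries));
    }
}
```

With a limit of 2, inserting a third key removes only the entry that was touched least recently, so the other cached arrays stay available instead of being rebuilt on the next sort.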




On 22 June 2010 07:27, Lance Norskog <goksron@gmail.com> wrote:
> No, this is basic to how Lucene works. You will need larger EC2 instances.
>
> On Mon, Jun 21, 2010 at 2:08 AM, Matteo Fiandesio
> <matteo.fiandesio@gmail.com> wrote:
>> Will compiling Solr with Lucene 2.9.3 instead of 2.9.1 solve this issue?
>> Regards,
>> Matteo
>>
>> On 19 June 2010 02:28, Lance Norskog <goksron@gmail.com> wrote:
>>> The Lucene implementation of sorting creates an array of four-byte
>>> ints for every document in the index, and another array of the unique
>>> values in the field.
>>> If the timestamps are 'date' or 'tdate' in the schema, they do not
>>> need the second array.
>>>
>>> You can also sort by a field's value with a function query. This does not
>>> build the arrays, but might be a little slower.
>>> Yes, the sort arrays (and also facet values for a field) should be
>>> controlled by a fixed-size cache, but they are not.
>>>
>>> On Fri, Jun 18, 2010 at 7:52 AM, Matteo Fiandesio
>>> <matteo.fiandesio@gmail.com> wrote:
>>>> Hello,
>>>> we are experiencing OOM exceptions in our single core solr instance
>>>> (on a (huge) amazon EC2 machine).
>>>> We searched the mailing list extensively and analyzed jmap/jhat dumps,
>>>> and the problem lies in the Lucene FieldCache, which fills
>>>> the heap and blows up the server.
>>>>
>>>> Our index is quite small but we have a lot of sort queries on fields
>>>> that are dynamic, of type long representing timestamps, and are not
>>>> present in all the documents.
>>>> Those queries apply sorting on 12-15 of those fields.
>>>>
>>>> We are using solr 1.4 in production and the dump shows a lot of
>>>> Integer/Character and Byte Array filled up with 0s.
>>>> With Solr's trunk code things do not change.
>>>>
>>>> In the mailing list we saw a lot of messages related to this issue:
>>>> we tried truncating the dates to day precision, using sortMissingLast =
>>>> true, changing the field type from slong to long, setting autowarming to
>>>> different values, and disabling and enabling caches with different values,
>>>> but we did not manage to solve the problem.
>>>>
>>>> We were thinking of implementing an LRUFieldCache field type to manage
>>>> the FieldCache as an LRU and prevent the OOMs but, before starting a new
>>>> development, we want to be sure that we are not doing anything wrong
>>>> in the solr configuration or in the index generation.
>>>>
>>>> Any help would be appreciated.
>>>> Regards,
>>>> Matteo
>>>>
>>>
>>>
>>>
>>> --
>>> Lance Norskog
>>> goksron@gmail.com
>>>
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
