lucene-solr-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Solr Trunk Heap Space Issues
Date Tue, 06 Oct 2009 19:30:50 GMT
This is looking like it's just a Lucene oddity you get when adding a
single doc, due to some changes with the NRT stuff.

Mark Miller wrote:
> Okay - I'm sorry - serves me right for working sick.
>
> Now that I have put on my glasses and correctly tagged my two Eclipse tests:
>
> It still appears that trunk likes to use more RAM.
>
> I switched both tests to one million iterations and watched the heap.
>
> The test from the build around May 5th (I promise :) ) regularly GC's
> down to about 70-80MB after a fair time of running. It doesn't appear
> to climb - it keeps GC'ing back to 70-80 (after starting out by GC'ing
> down to 40 for a bit).
>
> The test from trunk, after a fair time of running, keeps GC'ing down to
> about 120-150MB - 150 at the end, slowly working its
> way up from 90-110 at the beginning.
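A minimal sketch of this kind of post-GC heap sampling, assuming plain java.lang.Runtime (note that System.gc() is only a request, so any reading is approximate, and HeapSample is a hypothetical helper, not the actual test code):

```java
public class HeapSample {
    // Approximate used-heap reading after requesting a GC, in MB.
    public static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint to the JVM, not a guarantee
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("post-GC used heap ~" + usedHeapMb() + " MB");
    }
}
```

Sampling this inside the indexing loop is roughly what watching the GC floor in a profiler shows: the value the heap settles back to after each collection.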
>
> Don't know what that means yet - but it appears trunk likes to use a bit
> more RAM while indexing. Odd that it's so much more, because these docs
> are tiny:
>
>     String[] fields = {"text","simple"
>             ,"text","test"
>             ,"text","how now brown cow"
>             ,"text","what's that?"
>             ,"text","radical!"
>             ,"text","what's all this about, anyway?"
>             ,"text","just how fast is this text indexing?"
>     };
>
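For reference, a self-contained sketch of how that alternating name/value array pairs up into a multi-valued field map (TinyDocSketch and toFields are hypothetical illustrations, not the actual test code, and the real Lucene Document plumbing is omitted):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TinyDocSketch {
    // Pairs an alternating {name, value, name, value, ...} array into a
    // multi-valued field map; in the test above every pair uses the same
    // "text" field name, so the doc is one field with several values.
    public static Map<String, List<String>> toFields(String[] flat) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (int i = 0; i + 1 < flat.length; i += 2) {
            doc.computeIfAbsent(flat[i], k -> new ArrayList<>()).add(flat[i + 1]);
        }
        return doc;
    }

    public static void main(String[] args) {
        String[] fields = {
            "text", "simple",
            "text", "test",
            "text", "how now brown cow",
            "text", "what's that?",
            "text", "radical!",
            "text", "what's all this about, anyway?",
            "text", "just how fast is this text indexing?"
        };
        // All seven values land on the single "text" field.
        System.out.println(toFields(fields).get("text").size());
    }
}
```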
> Mark Miller wrote:
>   
>> Okay, I juggled the tests in eclipse and flipped the results. So they
>> make sense.
>>
>> Sorry - goose chase on this one.
>>
>> Yonik Seeley wrote:
>>   
>>> I don't see this with trunk... I just tried TestIndexingPerformance
>>> with 1M docs, and it seemed to work fine.
>>> Memory use stabilized at 40MB.
>>> Most memory use was for indexing (not analysis).
>>> char[] topped out at 4.5MB
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>>
>>> On Tue, Oct 6, 2009 at 12:31 PM, Mark Miller <markrmiller@gmail.com> wrote:
>>>   
>>>> Yeah - I was wondering about that ... not sure how these guys are
>>>> stacking up ...
>>>>
>>>> Yonik Seeley wrote:
>>>>     
>>>>> TestIndexingPerformance?
>>>>> What the heck... that's not even multi-threaded!
>>>>>
>>>>> -Yonik
>>>>> http://www.lucidimagination.com
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller <markrmiller@gmail.com> wrote:
>>>>>
>>>>>> Darnit - didn't finish that email. This is after running your old
>>>>>> short doc perf test for 10,000 iterations. You see the same thing
>>>>>> with 1000 iterations, but much less pronounced - i.e. it keeps
>>>>>> gettin' worse with more iterations.
>>>>>>
>>>>>> Mark Miller wrote:
>>>>>>
>>>>>>> A little before and after. The before is from around May 5th-ish -
>>>>>>> the after is trunk.
>>>>>>>
>>>>>>> http://myhardshadow.com/memanalysis/before.png
>>>>>>> http://myhardshadow.com/memanalysis/after.png
>>>>>>>
>>>>>>> Mark Miller wrote:
>>>>>>>
>>>>>>>> Took a peek at the checkout around the time he says he's using.
>>>>>>>>
>>>>>>>> CharTokenizer appears to be holding onto much larger char[] arrays
>>>>>>>> now than before. Same with snowball.Among - it used to be almost
>>>>>>>> nothing, now it's largio.
>>>>>>>>
>>>>>>>> The new TokenStream stuff appears to be clinging. Needs to find
>>>>>>>> some inner peace.
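The "clinging" pattern described above - a reused tokenizer whose internal char[] grows to fit the largest input ever seen and never shrinks - can be sketched with a hypothetical class (GrowOnlyBuffer illustrates the pattern only; it is not Lucene's actual CharTokenizer code):

```java
public class GrowOnlyBuffer {
    // Reused token streams typically keep one internal buffer for their
    // whole lifetime; it grows on demand but is never trimmed, so a
    // single large input pins that memory for as long as the stream lives.
    private char[] buf = new char[16];

    public void accept(String input) {
        if (input.length() > buf.length) {
            // grow-only resize: at least double, or enough to fit the input
            buf = new char[Math.max(buf.length * 2, input.length())];
        }
        input.getChars(0, input.length(), buf, 0);
    }

    public int capacity() {
        return buf.length;
    }

    public static void main(String[] args) {
        GrowOnlyBuffer b = new GrowOnlyBuffer();
        b.accept("tiny");
        System.out.println("after tiny input:  " + b.capacity());
        b.accept(new String(new char[50_000]));
        b.accept("tiny again");
        // capacity stays at the high-water mark even for tiny inputs
        System.out.println("after large input: " + b.capacity());
    }
}
```

Under this reading, heap growth during indexing tracks the largest values analyzed so far rather than the current doc - consistent with trunk's GC floor creeping up over a long run.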
>


-- 
- Mark

http://www.lucidimagination.com



