lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: Solr 7.5 - Indexing Failing due to "IndexWriter is Closed"
Date Wed, 03 Apr 2019 04:52:23 GMT
If you have fast disk and enough RAM, indexing is CPU limited. So adjust the indexing load
until the CPU is busy but not overloaded.
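
A rough sketch of that kind of indexing client, in case it helps picture the two-clients-per-CPU scheme quoted below. This is only an illustration assuming SolrJ with a plain HttpSolrClient; the thread count, batch size, URL, and field names are placeholders to tune, not values anyone prescribed in this thread.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
        // Illustrative knobs: roughly two sender threads per CPU on the Solr side.
        // availableProcessors() on the client is only a stand-in for that count.
        static final int THREADS = 2 * Runtime.getRuntime().availableProcessors();
        static final int BATCH_SIZE = 1000;
        static final String SOLR_URL = "http://localhost:8983/solr/mycollection"; // placeholder

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(THREADS);
            try (SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build()) {
                for (int t = 0; t < THREADS; t++) {
                    final int worker = t;
                    pool.submit(() -> {
                        List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
                        for (int i = 0; i < 100_000; i++) {   // stand-in for a real document source
                            SolrInputDocument doc = new SolrInputDocument();
                            doc.addField("id", worker + "-" + i);
                            doc.addField("body_t", "example document " + i);
                            batch.add(doc);
                            if (batch.size() == BATCH_SIZE) {
                                solr.add(batch);              // one batch in flight per thread
                                batch.clear();
                            }
                        }
                        if (!batch.isEmpty()) solr.add(batch);
                        return null;                          // Callable, so checked exceptions propagate
                    });
                }
                pool.shutdown();
                pool.awaitTermination(1, TimeUnit.HOURS);
                solr.commit();                                // single commit once all batches are in
            }
        }
    }

With two sender threads per CPU, one batch is being processed while the next is already on the wire, which is what keeps the CPUs busy without stalling. In real code you would keep the returned Futures so indexing errors surface instead of being discarded.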

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 2, 2019, at 9:23 PM, Aroop Ganguly <aroopganguly@icloud.com> wrote:
> 
> That's an interesting scaling scheme you mention.
> I have been trying to devise a good scheme for our scale.
> 
> I will try to see how this works out for us.
> 
>> On Apr 2, 2019, at 9:15 PM, Walter Underwood <wunder@wunderwood.org> wrote:
>> 
>> Yeah, that would overload it. To get good indexing speed, I configure two clients
>> per CPU on the indexing machine. With one shard on a 16-processor machine, that
>> would be 32 threads. With four shards on four 16-processor machines, 128 clients.
>> Basically, one thread is waiting while the CPU processes a batch and the other is
>> sending the next batch.
>> 
>> That should get the cluster to about 80% CPU. If the cluster is handling queries
>> at the same time, I cut that way back, like one client thread for every two CPUs.
>> 
>> wunder
>> Walter Underwood
>> wunder@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Apr 2, 2019, at 8:13 PM, Aroop Ganguly <aroopganguly@icloud.com> wrote:
>>> 
>>> Multiple threads to the same index? And how many concurrent threads?
>>> 
>>> Our case is not merely multiple threads but actually large-scale Spark indexer
>>> jobs that index 1B records at a time with a concurrency of 400.
>>> In this case, multiple such jobs were indexing into the same index.
>>> 
>>> 
>>>> On Apr 2, 2019, at 7:25 AM, Walter Underwood <wunder@wunderwood.org> wrote:
>>>> 
>>>> We run multiple threads indexing to Solr all the time and have been doing
>>>> so for years.
>>>> 
>>>> How big are your documents and how big are your batches?
>>>> 
>>>> wunder
>>>> Walter Underwood
>>>> wunder@wunderwood.org
>>>> http://observer.wunderwood.org/  (my blog)
>>>> 
>>>>> On Apr 1, 2019, at 10:51 PM, Aroop Ganguly <aroopganguly@icloud.com> wrote:
>>>>> 
>>>>> Turns out the cause was multiple indexing jobs writing into the same index
>>>>> simultaneously, which one can imagine would put heavy JVM load on certain replicas.
>>>>> Once this was found and only one job was run at a time, things went back to normal.
>>>>> 
>>>>> Your comments about there being no correlation to the stack trace seem right!
>>>>> 
>>>>>> On Apr 1, 2019, at 5:32 PM, Shawn Heisey <apache@elyograg.org> wrote:
>>>>>> 
>>>>>> On 4/1/2019 5:40 PM, Aroop Ganguly wrote:
>>>>>>> Thanks Shawn, for the initial response.
>>>>>>> Digging into it a bit, I was wondering if we'd care to read the innermost stack.
>>>>>>> The innermost stack seems to be telling us something about what triggered it?
>>>>>>> Of course, the system could have been overloaded as well, but is the exception
>>>>>>> telling us something, or is it of no use to consider this stack?
>>>>>> 
>>>>>> The stacktrace on an OOME is rarely useful. The memory allocation where the
>>>>>> error is thrown probably has absolutely no connection to the part of the program
>>>>>> where major amounts of memory are being used. It could be ANY memory allocation
>>>>>> that actually causes the error.
>>>>>> 
>>>>>> Thanks,
>>>>>> Shawn
>>>>> 
>>>> 
>>> 
>> 
> 

