lucene-solr-user mailing list archives

From Aroop Ganguly <aroopgang...@icloud.com>
Subject Re: Solr 7.5 - Indexing Failing due to "IndexWriter is Closed"
Date Wed, 03 Apr 2019 04:23:33 GMT
That's an interesting scaling scheme you mention.
I have been trying to devise a good scheme for our scale.

I will try to see how this works out for us.
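
A minimal sketch of that batching scheme in Java, assuming SolrJ, a placeholder collection URL, and a hypothetical nextDocument() source standing in for the real document feed: two client threads per CPU, each sending fixed-size batches.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BatchIndexer {

    // Placeholder endpoint and batch size; substitute your own collection and sizing.
    private static final String SOLR_URL = "http://localhost:8983/solr/mycollection";
    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        // Two client threads per CPU: one thread sends the next batch while
        // Solr is still processing the previous one.
        int threads = Runtime.getRuntime().availableProcessors() * 2;
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        try (SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build()) {
            for (int t = 0; t < threads; t++) {
                pool.submit(() -> {
                    List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
                    SolrInputDocument doc;
                    while ((doc = nextDocument()) != null) {   // hypothetical document source
                        batch.add(doc);
                        if (batch.size() == BATCH_SIZE) {
                            solr.add(batch);                   // one update request per batch
                            batch.clear();
                        }
                    }
                    if (!batch.isEmpty()) {
                        solr.add(batch);
                    }
                    return null;
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.DAYS);
            solr.commit();   // single commit at the end; autoCommit works too
        }
    }

    // Stand-in for whatever produces your documents; returns null when exhausted.
    private static SolrInputDocument nextDocument() {
        return null;
    }
}

Scaled out to four 16-processor indexing machines, the same sizing works out to the 128 clients described below.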

> On Apr 2, 2019, at 9:15 PM, Walter Underwood <wunder@wunderwood.org> wrote:
> 
> Yeah, that would overload it. To get good indexing speed, I configure two clients per CPU on the indexing machine. With one shard on a 16-processor machine, that would be 32 threads. With four shards on four 16-processor machines, 128 clients. Basically, one thread is waiting while the CPU processes a batch and the other is sending the next batch.
> 
> That should get the cluster to about 80% CPU. If the cluster is handling queries at the same time, I cut that way back, like one client thread for every two CPUs.
> 
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Apr 2, 2019, at 8:13 PM, Aroop Ganguly <aroopganguly@icloud.com> wrote:
>> 
>> Multiple threads to the same index? And how many concurrent threads?
>> 
>> Our case is not merely multiple threads, but large-scale Spark indexer jobs that index 1B records at a time with a concurrency of 400.
>> In this case, multiple such jobs were indexing into the same index.
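
(For reference, a partition-level Spark-to-Solr writer often looks roughly like the sketch below. This is only an assumed shape using SolrJ, not the actual job described above; SOLR_URL, BATCH_SIZE, and the "id" field are placeholders.)

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import java.util.ArrayList;
import java.util.List;

public class SparkSolrWriter {

    private static final String SOLR_URL = "http://localhost:8983/solr/mycollection"; // placeholder
    private static final int BATCH_SIZE = 1000;

    // Each Spark partition opens its own client and streams its rows to Solr in batches,
    // so total indexing concurrency is the number of partitions running at once.
    public static void write(Dataset<Row> dataset) {
        dataset.toJavaRDD().foreachPartition(rows -> {
            try (SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build()) {
                List<SolrInputDocument> batch = new ArrayList<>(BATCH_SIZE);
                while (rows.hasNext()) {
                    Row row = rows.next();
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", row.getAs("id"));   // map the rest of your fields here
                    batch.add(doc);
                    if (batch.size() == BATCH_SIZE) {
                        solr.add(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    solr.add(batch);
                }
            }
        });
    }
}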
>> 
>> 
>>> On Apr 2, 2019, at 7:25 AM, Walter Underwood <wunder@wunderwood.org> wrote:
>>> 
>>> We run multiple threads indexing to Solr all the time and have been doing so for years.
>>> 
>>> How big are your documents and how big are your batches?
>>> 
>>> wunder
>>> Walter Underwood
>>> wunder@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>>> On Apr 1, 2019, at 10:51 PM, Aroop Ganguly <aroopganguly@icloud.com> wrote:
>>>> 
>>>> Turns out the cause was multiple indexing jobs writing to the same index simultaneously, which, as one can imagine, can put heavy JVM load on certain replicas.
>>>> Once this was found and only one job ran at a time, things were back to normal.
>>>> 
>>>> Your comment about there being no correlation to the stack trace seems right!
>>>> 
>>>>> On Apr 1, 2019, at 5:32 PM, Shawn Heisey <apache@elyograg.org> wrote:
>>>>> 
>>>>> On 4/1/2019 5:40 PM, Aroop Ganguly wrote:
>>>>>> Thanks, Shawn, for the initial response.
>>>>>> Digging into it a bit, I was wondering if we should pay attention to the innermost stack.
>>>>>> The innermost stack seems to be telling us something about what triggered it?
>>>>>> Of course, the system could have been overloaded as well, but is the exception telling us something, or is this stack of no use to consider?
>>>>> 
>>>>> The stacktrace on OOME is rarely useful. The memory allocation where the error is thrown probably has absolutely no connection to the part of the program where major amounts of memory are being used. It could be ANY memory allocation that actually causes the error.
>>>>> 
>>>>> Thanks,
>>>>> Shawn
>>>> 
>>> 
>> 
> 

