lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: [Q] Faster Atomic Updates - use docValues?
Date Wed, 04 Dec 2019 15:01:10 GMT
This is a huge red flag to me: “(but I could only test for the first few thousand documents”

You’re probably right that that would speed things up, but pretty soon when you’re indexing
your entire corpus there are lots of other considerations.

The indexing rate you’re seeing is abysmal unless these are _huge_ documents, but you
indicate that at the start you’re getting 1,400 docs/second so I don’t think the complexity
of the docs is the issue here.

Do note that when we’re throwing RAM figures out, we need to draw a sharp distinction
between Java heap and total RAM. Some data is held on the heap and some in the OS
RAM due to MMapDirectory, see Uwe’s excellent article:
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Uwe recommends about 25% of your available physical RAM be allocated to Java as
a starting point. Your particular Solr installation may need a larger percent, IDK.
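
As a rough sketch of that starting point, the arithmetic looks like this. The function names and the 25% default are purely illustrative (nothing here is a Solr or Lucene API), and the result is only a first guess to tune from:

```python
def suggested_heap_mb(total_ram_mb: int, fraction: float = 0.25) -> int:
    """Return a starting Java heap size in MB as a fraction of physical RAM.

    The 0.25 default mirrors the "about 25%" starting point mentioned above;
    it is an assumption to adjust, not a rule.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return int(total_ram_mb * fraction)


def os_cache_mb(total_ram_mb: int, heap_mb: int) -> int:
    """Return the RAM left outside the Java heap.

    This is the upper bound on what the OS can use to cache index files
    read through MMapDirectory (other processes take their share too).
    """
    if heap_mb > total_ram_mb:
        raise ValueError("heap cannot exceed physical RAM")
    return total_ram_mb - heap_mb


if __name__ == "__main__":
    total = 64 * 1024  # hypothetical machine with 64 GB of physical RAM
    heap = suggested_heap_mb(total)
    print(f"heap: {heap} MB, left for the OS: {os_cache_mb(total, heap)} MB")
```

The point of computing both numbers is that they answer different questions: the first is what you would hand to the JVM as its maximum heap (-Xmx), the second is what remains for the OS to cache the index.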

But basically I’d go back to all default settings and change one thing at a time.
First, I’d look at GC performance. Is it taking all your CPU? In which case you probably
need to increase your heap. I pick this first because it’s very common that this is a root cause.
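
One rough way to put a number on "is GC taking all your CPU" is to take two samples of cumulative GC time (from GC logs or a tool such as jstat, for instance) and compare the difference to the wall-clock time between them. The sketch below only does that arithmetic. The function name is hypothetical, and what counts as "too much" is a judgment call for your installation:

```python
def gc_overhead(gc_ms_before: float, gc_ms_after: float,
                wall_ms_before: float, wall_ms_after: float) -> float:
    """Return the fraction of elapsed wall-clock time spent in GC.

    All four arguments are cumulative millisecond readings taken at two
    points in time; where they come from is up to you.
    """
    elapsed = wall_ms_after - wall_ms_before
    if elapsed <= 0:
        raise ValueError("second sample must be later than the first")
    gc_time = gc_ms_after - gc_ms_before
    if gc_time < 0:
        raise ValueError("cumulative GC time cannot decrease")
    return gc_time / elapsed


if __name__ == "__main__":
    # Hypothetical readings: 3 seconds of GC during a 10 second window.
    print(f"GC overhead: {gc_overhead(1000, 4000, 0, 10000):.0%}")
```

Sampling this at the start of the indexing run and again once the rate has dropped shows whether GC overhead grows along with the slowdown, which is the case where a larger heap is worth trying first.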

Next, I’d put a profiler on it to see exactly where I’m spending time. Otherwise you wind
up making random changes and hoping one of them works.

Best,
Erick

> On Dec 4, 2019, at 3:21 AM, Paras Lehana <paras.lehana@indiamart.com> wrote:
> 
> (but I could only test for the first few
> thousand documents

