lucene-dev mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: merge policy vs commit rates
Date Tue, 01 Aug 2017 12:04:10 GMT
The trade-off does not sound simple to me. This approach could lead to
having more segments overall, which could make search requests and updates
slower and more I/O-intensive, since they have to iterate over more
segments. I'm not saying this is a bad idea, but it could have unexpected
side effects.

Do you actually have a high commit rate, or a high reopen rate
(DirectoryReader.open(IndexWriter))? Reopening instead of committing
(while still committing, but less frequently) might decrease the I/O load:
NRT segments might never need to be written to disk at all if they are
merged before the next commit happens and you give enough memory to the
filesystem cache.
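
To make the "reopen often, commit rarely" idea concrete, here is a minimal
sketch of the scheduling logic only. The class name ReopenCommitSchedule
and the commitEvery parameter are hypothetical and not part of Lucene. In a
real application the reopen action would typically call
DirectoryReader.openIfChanged(reader, writer) and the commit action would
call IndexWriter.commit(); here both are plain Runnables so the sketch
stays self-contained.

```java
/**
 * Hypothetical helper (not a Lucene class): runs the cheap "reopen" action
 * on every tick and the expensive "commit" action only on every Nth tick.
 */
final class ReopenCommitSchedule {
    private final int commitEvery;  // assumption: one commit per this many reopens
    private final Runnable reopen;  // e.g. refresh the near-real-time reader
    private final Runnable commit;  // e.g. make changes durable on disk
    private long ticks = 0;

    ReopenCommitSchedule(int commitEvery, Runnable reopen, Runnable commit) {
        if (commitEvery < 1) {
            throw new IllegalArgumentException("commitEvery must be >= 1");
        }
        this.commitEvery = commitEvery;
        this.reopen = reopen;
        this.commit = commit;
    }

    /** Call once per refresh interval. */
    void tick() {
        reopen.run();
        ticks++;
        if (ticks % commitEvery == 0) {
            commit.run();
        }
    }
}
```

With commitEvery = 5, ten ticks trigger ten reopens but only two commits.
The right ratio depends on how much uncommitted data you can afford to lose
on a crash, which this sketch does not try to decide.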

On Tue, Aug 1, 2017 at 10:59, Tommaso Teofili <tommaso.teofili@gmail.com>
wrote:

> Hi all,
>
> lately I have been looking more closely at merge policies, particularly
> the tiered one, and I was wondering whether we can reduce the number of
> possibly avoidable merges in high-commit-rate scenarios, especially when a
> high percentage of the commits touch the same docs.
> I've observed how merges evolve in several such scenarios, and it seemed
> to me the merge policy was too aggressive in merging, causing a large IO
> overhead.
> I then tried the same with a merge policy that tentatively looks at the
> commit rate and skips merges when that rate is higher than a threshold;
> this seemed to give slightly better results in reducing the unneeded IO
> caused by avoidable merges.
>
> I know this is a bit abstract but I would like to know if anyone has any
> ideas or plans about mitigating the merge overhead in general and / or in
> similar cases.
>
> Regards,
> Tommaso
>
>
>
>
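
The quoted message does not show the experimental merge policy, so the
following is only a guess at the kind of check it describes: count commits
in a sliding time window and skip merging while the count is above a
threshold. The class name CommitRateGate, its parameters, and the
sliding-window approach are all assumptions, and how the check would be
wired into a Lucene MergePolicy is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch (not a Lucene class): tracks commit timestamps in a
 * sliding window and reports whether merges should be skipped for now.
 */
final class CommitRateGate {
    private final long windowMillis;       // assumption: length of the sliding window
    private final int maxCommitsInWindow;  // assumption: threshold above which merges are skipped
    private final Deque<Long> commitTimes = new ArrayDeque<>();

    CommitRateGate(long windowMillis, int maxCommitsInWindow) {
        this.windowMillis = windowMillis;
        this.maxCommitsInWindow = maxCommitsInWindow;
    }

    /** Record that a commit happened at the given time. */
    void recordCommit(long nowMillis) {
        commitTimes.addLast(nowMillis);
        evictOld(nowMillis);
    }

    /** True when more commits than the threshold fall inside the window. */
    boolean shouldSkipMerge(long nowMillis) {
        evictOld(nowMillis);
        return commitTimes.size() > maxCommitsInWindow;
    }

    // Drop timestamps that have fallen out of the window.
    private void evictOld(long nowMillis) {
        while (!commitTimes.isEmpty()
                && nowMillis - commitTimes.peekFirst() > windowMillis) {
            commitTimes.removeFirst();
        }
    }
}
```

As the reply above points out, every skipped merge leaves more segments
behind, so a gate like this trades merge I/O for potentially slower
searches and updates until the commit rate drops again.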
