lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] [Commented] (SOLR-7332) Seed version buckets with max version from index
Date Sat, 04 Apr 2015 14:36:33 GMT


Yonik Seeley commented on SOLR-7332:

Thanks, I'll try to get to reviewing this soonish.  I also want to think about it in the context
of SOLR-7347.

bq.  I've switched to using the versionInfo.blockUpdates
Yep, that's the correct way (to block updates while we're doing something update related).

bq. updates continue to flow in while reload occurs
I have less experience with the core reload code, but why do we need to re-find the highest
version here?

I also want to think a bit about deletes... we actually can't get the highest version from
the index if those versions happened to be deletes.
Consider the following:
1) add doc A, version 5
2) delete doc A, version 10
3) add doc A, version 8
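
To make that sequence concrete, here is a minimal sketch. It is an illustration only: DeleteScenario
and its methods are hypothetical names, not Solr classes, and two in-memory maps stand in for the
index and for the recorded deletes. After steps 1 and 2 the index no longer holds doc A, so a max
taken from the index alone sits below 10 and the add at version 8 would look new.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Solr APIs.
class DeleteScenario {
    // docId -> version of the live document (stands in for the index)
    final Map<String, Long> index = new HashMap<>();
    // docId -> version of the most recent delete (stands in for tlog / oldDeletes)
    final Map<String, Long> deletes = new HashMap<>();

    void add(String id, long version) {
        index.put(id, version);
    }

    void delete(String id, long version) {
        index.remove(id);          // the live document disappears from the index
        deletes.put(id, version);  // but the delete itself carries a version
    }

    // Highest version visible in the index alone (0 if the index is empty).
    long maxFromIndex() {
        return index.values().stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    // Highest version across the index and the recorded deletes.
    long maxFromAllSources() {
        long maxDelete = deletes.values().stream().mapToLong(Long::longValue).max().orElse(0L);
        return Math.max(maxFromIndex(), maxDelete);
    }
}
```

With add("A", 5) followed by delete("A", 10), maxFromIndex() cannot see version 10 while
maxFromAllSources() can, so only the latter identifies the version 8 add as stale.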

Currently, to get the last version for a document, we look in the tlog (which has deletes).
 If it's not there, we look in the index.  If it's not there, then we check UpdateLog.oldDeletes
(which keeps a list of the last 1000 deletes).  We just need to make sure that the version
seeding/checking does not re-open a hole due to deletes.  I think this means just making sure
we get the highest version from all sources (i.e. the tlog as well).   Making sure we never
go backwards in versions is essentially SOLR-7347.
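
The lookup order described above (tlog, then index, then oldDeletes) and the "highest version
from all sources" idea can be sketched roughly as follows. This is an assumption-laden illustration,
not the UpdateLog code: VersionLookup, lastVersion, seedVersion and shouldApply are hypothetical
names, and each source is modelled as a plain map from doc id to version.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names are Solr APIs.
class VersionLookup {
    final Map<String, Long> tlog = new HashMap<>();        // recent updates, including deletes
    final Map<String, Long> index = new HashMap<>();       // live documents only
    final Map<String, Long> oldDeletes = new HashMap<>();  // bounded list of recent deletes

    // Sources in the order they are consulted for a single document.
    private List<Map<String, Long>> sources() {
        return Arrays.asList(tlog, index, oldDeletes);
    }

    // Last known version for a document: the first source that knows the id wins.
    OptionalLong lastVersion(String id) {
        for (Map<String, Long> source : sources()) {
            Long v = source.get(id);
            if (v != null) {
                return OptionalLong.of(v);
            }
        }
        return OptionalLong.empty();
    }

    // Seed value: the highest version found in any source, not just the index.
    long seedVersion() {
        long max = 0L;
        for (Map<String, Long> source : sources()) {
            for (long v : source.values()) {
                max = Math.max(max, v);
            }
        }
        return max;
    }

    // An incoming update is applied only if it is newer than the last known version.
    boolean shouldApply(String id, long version) {
        OptionalLong last = lastVersion(id);
        return !last.isPresent() || version > last.getAsLong();
    }
}
```

If doc A's delete at version 10 is only present in oldDeletes, lastVersion("A") still finds it,
shouldApply("A", 8) returns false, and seedVersion() is at least 10.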

> Seed version buckets with max version from index
> ------------------------------------------------
>                 Key: SOLR-7332
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch
> See full discussion with Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket
> to the MAX value of the {{_version_}} field in the index as early as possible, such as after
> the first soft or hard commit. This will ensure that bulk adds where the docs don't exist
> avoid an unnecessary lookup for a non-existent document in the index.
