lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-7332) Seed version buckets with max version from index
Date Wed, 22 Apr 2015 21:33:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507943#comment-14507943 ]

Yonik Seeley edited comment on SOLR-7332 at 4/22/15 9:33 PM:
-------------------------------------------------------------

bq. Next, I tried increasing the number of reducers I was using to see how hard I could push
Solr and unfortunately, I ended up with 2 shards that had replicas that were out-of-sync with
their leader. 

Were there any recoveries or leader changes during the run?
In a way, it's great that you saw this!  The fact that the run contained only new adds should
significantly narrow down what this could be.  Hopefully you'll be able to reproduce it.

bq. can you think of a case where docs could be dropped with this new version bucket seeding
stuff?

No... if we accidentally set the version too high, there are no correctness issues, just extra
checks.
If we accidentally set the version too low, then we can fail to drop repeated or reordered
updates.  But in your test run, this shouldn't be an issue since it's only adds.  Any old
repeats won't change the number of docs in the index (or which docs they are).

edit: additionally, it can't be SOLR-7347 since that requires updates to the same document(s)
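
To make the too-high / too-low argument concrete, here is a minimal sketch of a single version bucket. It is not Solr's actual code: the class, its fields, and the `preload` helper are hypothetical names chosen for illustration, and the in-memory map stands in for the index lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of one version bucket (not Solr's classes). */
class VersionBucketSketch {
    private long highest;                                     // highest version the bucket believes it has seen
    private final Map<String, Long> index = new HashMap<>(); // doc id -> version (stand-in for the index)
    private int lookups = 0;                                  // number of per-document version checks

    VersionBucketSketch(long seed) {
        this.highest = seed;
    }

    /** Puts a doc into the stand-in index without touching the bucket's highest value. */
    void preload(String id, long version) {
        index.put(id, version);
    }

    int lookups() {
        return lookups;
    }

    /** Returns true if the update was applied, false if it was dropped as stale. */
    boolean apply(String id, long version) {
        if (version > highest) {
            // Looks newer than everything seen so far: applied without a lookup.
            highest = version;
            index.put(id, version);
            return true;
        }
        // Not provably newer: check the version stored for this specific id.
        lookups++;
        Long existing = index.get(id);
        if (existing != null && existing >= version) {
            return false; // repeated or reordered update, drop it
        }
        index.put(id, version);
        return true;
    }

    public static void main(String[] args) {
        VersionBucketSketch tooLow = new VersionBucketSketch(50);
        tooLow.preload("A", 100);
        System.out.println("seed too low, stale update applied: " + tooLow.apply("A", 80));

        VersionBucketSketch tooHigh = new VersionBucketSketch(1000);
        tooHigh.preload("A", 100);
        System.out.println("seed too high, new doc applied: " + tooHigh.apply("B", 200)
                + ", lookups: " + tooHigh.lookups());
    }
}
```

In this model a seed that is too high only costs a lookup for a new document, while a seed that is too low lets a stale update to an existing document through unchecked. With adds of new documents only, the second case cannot change which documents end up in the index.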



> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch,
SOLR-7332.patch
>
>
> See the full discussion between Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket
to the MAX value of the {{_version_}} field in the index as early as possible, such as after
the first soft or hard commit. This ensures that bulk adds of docs that don't yet exist
avoid an unnecessary lookup for a non-existent document in the index.
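
The seeding idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: the class and method names are hypothetical, the list of versions stands in for reading the version field from the index, and treating `0` as "not yet seeded" is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of seeding version buckets from the index's max version. */
class BucketSeedingSketch {

    /** Max of the versions found in the index; 0 if the index is empty. */
    static long maxVersion(List<Long> indexedVersions) {
        long max = 0L;
        for (long v : indexedVersions) {
            max = Math.max(max, v);
        }
        return max;
    }

    /** Every bucket starts with the same seed: the max version from the index. */
    static long[] seedBuckets(int numBuckets, long maxFromIndex) {
        long[] highest = new long[numBuckets];
        Arrays.fill(highest, maxFromIndex);
        return highest;
    }

    /**
     * An incoming add needs a per-document lookup unless it is provably newer
     * than everything the bucket knows about. Assumption: 0 means "unseeded".
     */
    static boolean needsLookup(long bucketHighest, long incomingVersion) {
        return bucketHighest == 0L || incomingVersion <= bucketHighest;
    }

    public static void main(String[] args) {
        long seed = maxVersion(Arrays.asList(5L, 42L, 17L));
        long[] buckets = seedBuckets(4, seed);
        System.out.println("unseeded bucket needs lookup: " + needsLookup(0L, 100L));
        System.out.println("seeded bucket needs lookup:   " + needsLookup(buckets[0], 100L));
    }
}
```

Under these assumptions, an unseeded bucket forces a lookup for every add, while a seeded bucket lets adds with a version above the seed skip the lookup for a document that cannot exist yet.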


