lucene-dev mailing list archives

From "Timothy Potter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-7332) Seed version buckets with max version from index
Date Wed, 29 Apr 2015 14:43:06 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519457#comment-14519457 ]

Timothy Potter commented on SOLR-7332:
--------------------------------------

Haven't been able to reproduce this with many stress tests on EC2 and it's starting to get
expensive ;-)

bq. Were there any recoveries or change of leaders during the run?

There definitely could have been some recoveries but I'm not sure. I'm taking a snapshot of
cluster state before I run my tests to compare to after in case I do reproduce this. Yesterday
I pushed it very hard with 48 reducers from Hadoop, which led to some network issue between
leader and replica and the leader put the replica into recovery, see SOLR-7483. However, the
replica eventually recovered and was in-sync with the leader at the end, which is goodness.

bq. No... 

Thanks for confirming. I was thinking that maybe it had something to do with this patch resetting
the max after replaying the tlog:

From UpdateLog:
{code}
@@ -1247,6 +1269,12 @@
         // change the state while updates are still blocked to prevent races
         state = State.ACTIVE;
         if (finishing) {
+
+          // after replay, update the max from the index
+          log.info("Re-computing max version from index after log re-play.");
+          maxVersionFromIndex = null;
+          getMaxVersionFromIndex();
+
           versionInfo.unblockUpdates();
         }
{code}

But since updates are blocked while this happens, it seems like the right thing to do.
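To make the reasoning concrete, here is a minimal sketch of the pattern in the diff above: invalidate a cached max version and recompute it while writers are blocked. It is not Solr's UpdateLog. The class name MaxVersionCache, the methods finishReplay() and readMaxDuringUpdate(), and the use of a read/write lock in place of versionInfo's blocking are all assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

/** Hypothetical sketch; names and locking scheme are assumptions, not Solr code. */
public class MaxVersionCache {
    // Updates take the read lock; "blocking updates" is modeled by the write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    // Stand-in for scanning the index for the highest version value.
    private final LongSupplier indexScan;
    // null means "not computed yet", mirroring the reset in the diff.
    private Long maxVersionFromIndex;

    public MaxVersionCache(LongSupplier indexScan) {
        this.indexScan = indexScan;
    }

    /** Lazily computes the max version and caches it. */
    public synchronized long getMaxVersionFromIndex() {
        if (maxVersionFromIndex == null) {
            maxVersionFromIndex = indexScan.getAsLong();
        }
        return maxVersionFromIndex;
    }

    /** What an update would see; many updates may hold the read lock at once. */
    public long readMaxDuringUpdate() {
        lock.readLock().lock();
        try {
            return getMaxVersionFromIndex();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** After log replay: drop the cached value and recompute before unblocking. */
    public void finishReplay() {
        lock.writeLock().lock(); // no update can observe the reset value
        try {
            synchronized (this) {
                maxVersionFromIndex = null;
            }
            getMaxVersionFromIndex();
        } finally {
            lock.writeLock().unlock(); // analogous to unblockUpdates()
        }
    }
}
```

Because the reset and the recompute both happen before the write lock is released, an update in this sketch never sees the cache in its cleared state.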

I'm going to run this a few more times using the same setup as when it occurred the first time
and then I think we should commit this to trunk and see how it behaves for a few days, as
the performance improvement is a big win.

> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch,
SOLR-7332.patch
>
>
> See full discussion with Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket
to the MAX value of the {{_version_}} field in the index as early as possible, such as after
the first soft or hard commit. This will ensure that bulk adds where the docs don't exist
avoid an unnecessary lookup for a non-existent document in the index.
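
The idea in the description can be sketched as follows. This is a simplified illustration, not Solr's implementation: it models a single bucket (the class name VersionBucketSketch and all of its methods are hypothetical), uses a HashMap as a stand-in for the index, and assumes that a highest value of 0 means "unknown".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-bucket sketch; a real system would hash ids across many buckets. */
public class VersionBucketSketch {
    private long highest;                                    // 0 means "unknown" (assumption)
    private final Map<String, Long> index = new HashMap<>(); // stand-in for the index
    private int lookups;                                     // counts simulated index lookups

    /** Seed the bucket with the max version found in the index. */
    public synchronized void seed(long maxVersionFromIndex) {
        highest = Math.max(highest, maxVersionFromIndex);
    }

    /** Returns true if the add was applied, false if it was dropped as stale. */
    public synchronized boolean add(String id, long version) {
        if (highest != 0 && version > highest) {
            // Newer than anything this bucket has seen, so no lookup is needed.
        } else {
            // Unknown or possibly reordered: check the stored version for this id.
            lookups++;
            Long existing = index.get(id);
            if (existing != null && existing >= version) {
                return false;
            }
        }
        index.put(id, version);
        highest = Math.max(highest, version);
        return true;
    }

    public synchronized int lookups() {
        return lookups;
    }
}
```

In this sketch an unseeded bucket pays for a lookup on its first add even though the document does not exist, while a seeded bucket skips the lookup for any add whose version exceeds the seeded maximum.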



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

