lucene-dev mailing list archives

From "Timothy Potter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-7332) Seed version buckets with max version from index
Date Wed, 22 Apr 2015 16:51:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507393#comment-14507393 ]

Timothy Potter commented on SOLR-7332:
--------------------------------------

Been running some larger-scale perf tests in EC2 with this, using the same basic setup as described
here: https://lucidworks.com/blog/introducing-the-solr-scale-toolkit/

Previously, I indexed 130M docs into a 10x2 (10 shards, rf=2) collection using 10 r3.2xlarge
instances at an avg. rate of 34,881 docs/sec using Solr 4.8.1. With branch5x and the latest
patches for SOLR-7332 and SOLR-7333 applied, the same test resulted in 74,713 docs/sec, which
is better than a 2x improvement (roughly 2.1x). The results repeated several times :-)

Next, I tried increasing the number of reducers I was using to see how hard I could push Solr
and, unfortunately, I ended up with 2 shards that had replicas that were out-of-sync with their
leader. I'm digging into what may have caused that (proving hard to reproduce now) ... [~yonik@apache.org]
can you think of a case where docs could be dropped with this new version bucket seeding stuff?
My test is all new adds into an empty collection, no deletes, no updates. At first I was thinking
it may be due to seeding highest with the new clock from VersionInfo when the index is
empty:
{code}
+      long maxVersion = Math.max(maxVersionFromIndex, maxVersionFromRecent);
+      if (maxVersion == 0L) {
+        maxVersion = versions.getNewClock();
+        log.warn("Could not find max version in index or recent updates, using new clock {}", maxVersion);
+      }
{code}

But I can't see how that would cause an issue with this logic in DistributedUpdateProcessor's
versionAdd method (which is the only code I see that drops requests on a replica):

{code}
            if (bucketVersion != 0 && bucketVersion < versionOnUpdate) {
              // we're OK... this update has a version higher than anything we've seen
              // in this bucket so far, so we know that no reordering has yet occurred.
              bucket.updateHighest(versionOnUpdate);
            } else {
              // there have been updates higher than the current update.  we need to check
              // the specific version for this id.
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              if (lastVersion != null && Math.abs(lastVersion) >= versionOnUpdate) {
                // This update is a repeat, or was reordered.  We need to drop this update.
                log.debug("Dropping add update due to version {}", idBytes.utf8ToString());
                return true;
              }

              // also need to re-apply newer deleteByQuery commands
              checkDeleteByQueries = true;
            }
{code}

Seems to me that if the leader's and replica's clocks are out-of-sync, then for a new add, either
the replica's highest is too low, so the if block applies, or it is too high and the else block
applies. But since the doc doesn't exist, lastVersion == null, so the update isn't dropped in
either case. I'll know more once I reproduce it again, but wanted to let you know the current
status of this and see if anything jumps out at you as to what could cause the replica to be
out-of-sync with the leader.
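
The reasoning above can be checked with a small standalone model. This is only a sketch under
stated assumptions, not Solr code: the class name, the seedHighest and wouldDrop helpers, and the
clock values are all hypothetical, and lastVersion == null stands in for vinfo.lookupVersion
finding no document for the id.

```java
import java.util.function.LongSupplier;

public class VersionSeedSketch {

    // Mirrors the seeding fallback quoted above: prefer the max version found in the
    // index or recent updates; fall back to a fresh clock value only when both are 0.
    public static long seedHighest(long maxVersionFromIndex, long maxVersionFromRecent,
                                   LongSupplier newClock) {
        long maxVersion = Math.max(maxVersionFromIndex, maxVersionFromRecent);
        if (maxVersion == 0L) {
            maxVersion = newClock.getAsLong();
        }
        return maxVersion;
    }

    // Mirrors the replica-side check quoted above; returns true if the add would be dropped.
    // lastVersion is null when the document id has never been seen (hypothetical stand-in
    // for the per-id version lookup).
    public static boolean wouldDrop(long bucketVersion, long versionOnUpdate, Long lastVersion) {
        if (bucketVersion != 0 && bucketVersion < versionOnUpdate) {
            return false; // fast path: newer than anything seen in this bucket
        }
        // slow path: drop only if this id already has an equal or newer version
        return lastVersion != null && Math.abs(lastVersion) >= versionOnUpdate;
    }

    public static void main(String[] args) {
        long leaderVersion = 1_000L; // version assigned by the leader (made-up value)

        // Empty index on the replica: highest is seeded from the replica's own clock.
        long behind = seedHighest(0L, 0L, () -> 500L);   // replica clock behind the leader
        long ahead  = seedHighest(0L, 0L, () -> 2_000L); // replica clock ahead of the leader

        System.out.println(wouldDrop(behind, leaderVersion, null));  // false: if block applies
        System.out.println(wouldDrop(ahead, leaderVersion, null));   // false: else block, no prior doc
        System.out.println(wouldDrop(ahead, leaderVersion, 1_500L)); // true: id already has a newer version
    }
}
```

In this model a brand-new add is kept on both paths; a drop only happens when the id already
has an equal or higher version, which should not occur in an all-new-adds test.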

> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch
>
>
> See full discussion with Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each version bucket
> to the MAX value of the {{_version_}} field in the index as early as possible, such as after
> the first soft- or hard-commit. This will ensure that bulk adds where the docs don't exist
> avoid an unnecessary lookup for a non-existent document in the index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

