lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (SOLR-8709) Add checksum to the TopicStream to ensure delivery of all documents within a Topic
Date Tue, 05 Apr 2016 16:34:25 GMT

     [ https://issues.apache.org/jira/browse/SOLR-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Joel Bernstein updated SOLR-8709:
---------------------------------
    Comment: was deleted

(was: A little more detail on the design. A new "retentionWindow" parameter will be added
to the TopicStream to define the window of time for holding processed version numbers. This
retentionWindow should be larger than the soft commit window. A TreeMap will be used to hold
all the version numbers that have been processed. This TreeMap will be trimmed to hold only
version numbers within the retention window. A PostFilter will be added that checks to see
if a document is within the retentionWindow but is not in the TreeMap. This should catch any
out of order version numbers. The contents of the TreeMap will be persisted as part of the
checkpoint. )
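
For illustration, the TreeMap design in the deleted comment could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not code from Solr: the class name ProcessedVersionWindow, its methods, and the idea of passing a millisecond timestamp alongside each version are all hypothetical, and the PostFilter wiring and checkpoint persistence are left out.

```java
import java.util.TreeMap;

// Hypothetical sketch; none of these names exist in Solr.
class ProcessedVersionWindow {
    // version number -> time (ms) at which that version was processed
    private final TreeMap<Long, Long> processed = new TreeMap<>();
    private final long retentionWindowMs;

    ProcessedVersionWindow(long retentionWindowMs) {
        this.retentionWindowMs = retentionWindowMs;
    }

    // Record a version number that has been delivered.
    void markProcessed(long version, long timestampMs) {
        processed.put(version, timestampMs);
    }

    // Drop entries older than the retention window.
    void trim(long nowMs) {
        long cutoff = nowMs - retentionWindowMs;
        processed.values().removeIf(ts -> ts < cutoff);
    }

    // True when a document falls inside the retention window
    // but its version was never recorded as processed.
    boolean isMissed(long version, long docTimestampMs, long nowMs) {
        boolean inWindow = docTimestampMs >= nowMs - retentionWindowMs;
        return inWindow && !processed.containsKey(version);
    }

    int size() {
        return processed.size();
    }
}
```

In this sketch a filter would call isMissed for each candidate document and resend those for which it returns true; how a document's timestamp would be obtained is an assumption the comment does not settle.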

> Add checksum to the TopicStream to ensure delivery of all documents within a Topic
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-8709
>                 URL: https://issues.apache.org/jira/browse/SOLR-8709
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>
> Currently the TopicStream can miss documents if version numbers are received out-of-order.
The TopicStream sorts on version number so it will only miss out-of-order versions that span
commit boundaries. *Stress testing was not able to create a missed-document scenario*, but code
review points to the possibility of this happening.
> In order to resolve this issue we can adopt an approach that keeps a checksum of the
version numbers for a specific time window. This checksum can be checked each run and if the
checksums don't match the documents from the time window can be resent. As long as the time
window is longer than the softCommit interval, this will guarantee delivery of all documents
for the Topic. This won't guarantee *one time delivery* but should provide a reasonable
expectation of one time delivery.
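
The checksum comparison described in the issue could look roughly like the sketch below. It is an assumption-laden illustration rather than the TopicStream implementation: the class VersionChecksum and both method names are hypothetical, and a wrapping sum is used only because it is order-independent and easy to follow; the issue does not say which checksum function is intended.

```java
import java.util.Collection;

// Hypothetical sketch; not part of Solr's TopicStream.
class VersionChecksum {
    // Order-independent checksum: a wrapping sum of the version numbers
    // seen in the time window. A real implementation may want a stronger function.
    static long checksum(Collection<Long> versions) {
        long sum = 0L;
        for (long v : versions) {
            sum += v;
        }
        return sum;
    }

    // Compare the checksum of what was delivered with the checksum of what the
    // index now reports for the same window; a mismatch means the window
    // should be resent.
    static boolean needsResend(long deliveredChecksum, Collection<Long> indexedVersions) {
        return deliveredChecksum != checksum(indexedVersions);
    }
}
```

Because a mismatch triggers a resend of the whole window, documents already delivered from that window would be delivered again, which is consistent with the issue's note that one time delivery is not guaranteed.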



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

