lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8380) UTF8TaxonomyWriterCache inconsistency
Date Wed, 04 Jul 2018 07:16:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532367#comment-16532367 ]

ASF subversion and git services commented on LUCENE-8380:
---------------------------------------------------------

Commit cedeaf976dd9a6c65836b325714496a8d8c1a0cd in lucene-solr's branch refs/heads/branch_7x
from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cedeaf9 ]

LUCENE-8380: UTF8TaxonomyWriterCache page/ offset calculation bug


> UTF8TaxonomyWriterCache inconsistency
> -------------------------------------
>
>                 Key: LUCENE-8380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8380
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>    Affects Versions: 7.1
>            Reporter: Ruslan Torobaev
>            Priority: Minor
>             Fix For: 7.5
>
>         Attachments: LUCENE-8380.patch, lucene-taxonomy-cache-report.tar.gz, taxonomy-cache.json.gz, taxonomy.tar.gz
>
>
> I’m facing a problem with taxonomy writer cache inconsistency. At some point,
> UTF8TaxonomyWriterCache starts to return the wrong ord for some facet labels. As a result,
> wrong ords are written to the doc facet fields, and wrong counts (undercounts) are returned
> during search. The bug manifests on different servers with different index contents (we
> have several separate indexes with unique data).
> Unfortunately, I can’t reproduce this behaviour in tests.
> I've dumped a "broken" UTF8TaxonomyWriterCache instance and created an app that loads it
> and compares it with the real taxonomy. The dumps and the app are attached. To run the demo,
> extract the archives' contents and execute:
> {code}
> mvn compile
> mvn exec:java -Dexec.mainClass="me.torobaev.lucene.taxonomy.cache.TaxonomyCacheCheck" -DtaxonomyDir=../taxonomy/ -DcacheDump=../taxonomy-cache.json
> {code}
> As you can see, the labels [frametype, 7] and [modification_id, 682] have the same ord in the cache.
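
The symptom reported above (two distinct labels sharing one ord) can be checked on any dumped label-to-ord mapping by grouping labels by ordinal and keeping the ordinals claimed by more than one label. The following is only a minimal sketch in plain Java. It is not the attached TaxonomyCacheCheck app and does not use any Lucene API; the class name DuplicateOrdCheck, the method findSharedOrds, the path-style label strings, and the ordinal values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or of the attached demo app.
final class DuplicateOrdCheck {

    /**
     * Groups labels by ordinal and returns only the ordinals that are
     * claimed by two or more labels. In a consistent cache the result is empty.
     */
    static Map<Integer, List<String>> findSharedOrds(Map<String, Integer> labelToOrd) {
        Map<Integer, List<String>> byOrd = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : labelToOrd.entrySet()) {
            byOrd.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Drop ordinals that map to exactly one label; those are healthy.
        byOrd.values().removeIf(labels -> labels.size() < 2);
        return byOrd;
    }

    public static void main(String[] args) {
        // Made-up ordinals; only the label names come from the report.
        Map<String, Integer> cache = new LinkedHashMap<>();
        cache.put("frametype/7", 1001);
        cache.put("modification_id/682", 1001);
        cache.put("color/red", 1002);

        // Prints the ordinals shared by several labels, with those labels.
        System.out.println(findSharedOrds(cache));
    }
}
```

A non-empty result only shows that the dumped mapping is inconsistent; it does not say which of the colliding labels holds the correct ord. That still requires a comparison against the real taxonomy, as the attached app does.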



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

