lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5850) Constants#LUCENE_MAIN_VERSION can have broken values
Date Mon, 28 Jul 2014 21:33:40 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076921#comment-14076921
] 

Uwe Schindler commented on LUCENE-5850:
---------------------------------------

Hi,

thanks for all the work done already. I am on vacation at the moment, so sorry for not responding
in time :-) I was talking to Robert and Simon today via Google Hangouts (my internet was slow
during the day). This issue contains several issues, which should be handled separately.

In fact we have actually 2 problems:
- The documentation of the LUCENE_MAIN_VERSION constant is inconsistent with its use. Originally
this was meant to be the "dot version only", because we decided to never change the index
format in bugfix versions. Because of this the suggested format was "x.y" (and the special
case for alpha/beta introduced by Robert). Unfortunately in some releases, the RM just search/replaced
all version numbers and we suddenly got minor versions there that were written into the index.
This is not really bad at all, although its stupid for file format versions (because it did
not change). I listened to Simon and Robert - both said that we should write the real version
into the index file! We do this already, but only in the commit metadata, not per segment.
We write the full version string there, including the RM's name and so on. So we record this
metadata in the index. But we cannot get the minor version for each segment. So we have to
decide if we write the bugfix version number to index file (x.y.z.a.b.c... as many as you
like). If we do this, we need the validation of LUCENE_MAIN_VERSION (smoker and -> see
below).
- Because of the documentation and missing knowledge, Shai committed a change to DocValues
that uses Version.parseLeniently() to check the index version. Shai already added some catch
blocks to work around issues (he mentioned them in the comment....!!!), instead of using the
officially written version comparator in StringUtils. This was a major bug, because in Version
4.9.1, the shit would be smoking :-) I am glad that Robert fixed it! Many thanks, you are
the best!!!

The documentation of the Version class is perfectly fine, maybe we should add a not, that
this class and parser is only for parsing versions that are user-input from config files.
Also the Version class is only for passing a hint to the API, which behaviour you want to
have. The Version class is not for file formats and should never serialized or used to switch
inside codecs depending on file versions. We can fix two things:
- Add StringUtils to Javadocs ({{@see}} with hint) of Version class
- Maybe add a "hack to Version.parseLeniently to strip everything after the second dot (it's
just a change of regex). By this users can enter 4.9.1, but it will return LUCENE_4_9.

I am happy that Mike added a Smoke tester check, because the test in our code base was not
good enough. But we should fix the version test correctly. Now I refer to Robert's complaint
about the system property: The system property in the test runner was added to prevent the
bug we have. This one passes the {{-Dversion=...}} constant down to tests, so we can validate
LUCENE_MAIN_VERSION. Unfortunately the "version" sysprop also contains bullshit after the
main version, so we just do a "startsWith" check in the test. And this is the bug. If LUCENE_MAIN_VERSION=="4.8"
and we pass {{-Dversion=4.8.1-foobar}} the test passes, because the sysprop starts with the
MAIN_VERSION.

I think the perfect fix can be done since I fixed common-build to have the main version and
the appendix separately. "version" is constrcuted in common-build from a prefix and suffix:

{code:xml}
  <property name="dev.version.base" value="5.0"/>
  <property name="dev.version.suffix" value="SNAPSHOT"/>
  <property name="dev.version" value="${dev.version.base}-${dev.version.suffix}"/>
  <property name="version" value="${dev.version}"/>
{code}

In fact we should change this and pass "dev.version.base" to the test runner and do an "equals"
check in the testcase. By that we guarantee that common-build's version is identical to the
LUCENE_MAIN_VERSION.

> Constants#LUCENE_MAIN_VERSION can have broken values 
> -----------------------------------------------------
>
>                 Key: LUCENE-5850
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5850
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: general/build
>    Affects Versions: 4.3.1, 4.5.1
>            Reporter: Simon Willnauer
>             Fix For: 5.0, 4.10
>
>         Attachments: LUCENE-5850.patch, LUCENE-5850_bomb.patch, LUCENE-5850_smoketester.patch
>
>
> Constants#LUCENE_MAIN_VERSION is set to the Lucene Main version and should not contain
minor versions. Well this is at least what I thought and to my knowledge what the comments
say too. Yet in for instance 4.3.1 and 4.5.1 we broke this such that the version from SegmentsInfo
can not be parsed with Version#parseLeniently. IMO we should really add an assertion that
this constant doesn't throw an error and / or make the smoketester catch this. to me this
is actually a index BWC break. Note that 4.8.1 doesn't have this problem...



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message