lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2094) Prepare CharArraySet for Unicode 4.0
Date Tue, 01 Dec 2009 11:53:21 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784189#action_12784189 ]

Simon Willnauer commented on LUCENE-2094:
-----------------------------------------

bq. I think changes other than that should be another issue, a sub-issue, or a linked issue?
As it stands, Robert's patch, having the same name as Simon's, makes it appear that it supersedes
the prior with the same name. It is confusing without the context of reading the thread.
+1 - I created LUCENE-2099 for that purpose and added Robert's latest patch to it.
I will link those two issues in a second.

bq. By using LUCENE_CURRENT, it means that the most recent behavior should always be used.
That might change in the future. If it does, then it would silently invalidate an index.
There are several reasons for this. The most important one is that the version does not affect those sets,
since they pass false for ignoreCase and we have full control over the stopwords. But I agree it
would be more "secure" to use Version.LUCENE_31, just in case somebody changes
the internal behavior of CharArraySet. I would still expect anybody changing the behavior
of this class to review its usages.

> Prepare CharArraySet for Unicode 4.0
> ------------------------------------
>
>                 Key: LUCENE-2094
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2094
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 3.0
>            Reporter: Simon Willnauer
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.patch, LUCENE-2094.txt, LUCENE-2094.txt, LUCENE-2094.txt
>
>
> CharArraySet lowercases its entries if created with the corresponding flag. As a result, String / char[] values containing Unicode 4.0 characters that are in the set cannot be retrieved in "ignorecase" mode.
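
To illustrate the problem described above, here is a minimal sketch in plain Java. It is not the CharArraySet code and not taken from any of the attached patches; the class and method names are hypothetical. It assumes that "Unicode 4.0 chars" refers to supplementary characters, which Java stores as a surrogate pair of two char values. Lowercasing one char at a time leaves such a pair untouched, while lowercasing by code point maps it correctly.

```java
public class CodePointLowerCase {

    // Hypothetical helper: lowercases one UTF-16 code unit at a time.
    // Surrogate halves have no case mapping, so supplementary characters stay unchanged.
    public static String lowerPerChar(String text) {
        char[] out = text.toCharArray();
        for (int i = 0; i < out.length; i++) {
            out[i] = Character.toLowerCase(out[i]);
        }
        return new String(out);
    }

    // Hypothetical helper: lowercases one code point at a time,
    // so a surrogate pair is treated as a single character.
    public static String lowerPerCodePoint(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 (DESERET CAPITAL LETTER LONG I) lowercases to U+10428.
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));

        System.out.println(lowerPerChar(upper).equals(lower));      // prints "false"
        System.out.println(lowerPerCodePoint(upper).equals(lower)); // prints "true"
    }
}
```

A set that normalizes its entries and lookups with the per-char variant would fail to match the uppercase and lowercase forms of such a character in "ignorecase" mode, which matches the symptom reported in the issue.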
