lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-5106) unban properties with unicode escapes
Date Fri, 12 Jul 2013 14:57:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706981#comment-13706981 ]

Uwe Schindler edited comment on LUCENE-5106 at 7/12/13 2:56 PM:
----------------------------------------------------------------

Ok, so what's your plan now?

The idea was to ban *inconsistency* for 4.4. For 4.5 we have enough time to fix all code to
use *only* Reader/Writer.

If we apply your patch, someone could add mixed usage again (also in 4.4), something as crazy
as what happened in SOLR-4914: the commit by [~romseygeek] was the worst thing one could do,
writing with UTF-8 enabled but reading *only with unicode-escapes allowed*.
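
To illustrate the kind of mismatch meant here, a minimal sketch using only {{java.util.Properties}} (this is not the SOLR-4914 code; the class and method names are hypothetical): a value is written through a UTF-8 Writer and read back through {{load(InputStream)}}, which decodes bytes as ISO-8859-1 and only understands unicode-escapes for anything beyond that.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not taken from Lucene/Solr.
class MixedPropertiesDemo {

    // Writes key=value through a UTF-8 Writer (no unicode-escapes are produced),
    // then reads the bytes back through load(InputStream), which assumes ISO-8859-1.
    static String writeUtf8ReadStream(String value) throws IOException {
        Properties out = new Properties();
        out.setProperty("key", value);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.store(w, null);
        }

        Properties in = new Properties();
        in.load(new ByteArrayInputStream(bytes.toByteArray()));
        return in.getProperty("key");
    }

    public static void main(String[] args) throws IOException {
        String original = "M\u00fcnchen"; // contains a non-ASCII character
        String readBack = writeUtf8ReadStream(original);
        // The two-byte UTF-8 sequence is decoded as two ISO-8859-1 characters,
        // so the value does not survive the round trip.
        System.out.println("survived: " + original.equals(readBack));
    }
}
```

Pure ASCII values pass through unchanged, which is why such a mismatch can go unnoticed until a non-ASCII value shows up.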

So for 4.4, for maximum compatibility, we keep what is currently committed, on the 4.4 branch only
(allowing only consistent InputStream/OutputStream use throughout the code). In 4.5 we allow only
the UTF-8 variant, Reader/Writer, throughout the code. Files written with unicode-escapes by old
Lucene/Solr code (4.4 and earlier) can then still be read and are decoded correctly, because the
{{load(Reader)}} method decodes unicode-escapes, too. In fact, files written by the OutputStream API
are US-ASCII only (see src.zip).
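
A small sketch of the compatibility argument, again with plain {{java.util.Properties}} and hypothetical class and method names: output of {{store(OutputStream)}} contains only US-ASCII bytes, and reading those bytes through {{load(Reader)}} with a UTF-8 Reader restores the original value because the escapes are decoded there as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not taken from Lucene/Solr.
class EscapedPropertiesDemo {

    // "Old style" write: store(OutputStream) escapes everything outside printable ASCII.
    static byte[] storeViaStream(String value) throws IOException {
        Properties p = new Properties();
        p.setProperty("key", value);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        p.store(bytes, null);
        return bytes.toByteArray();
    }

    // True if no byte has the high bit set, i.e. the data is pure US-ASCII.
    static boolean isAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // "New style" read: load(Reader) with UTF-8, which also decodes unicode-escapes.
    static String loadViaReader(byte[] data) throws IOException {
        Properties p = new Properties();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            p.load(r);
        }
        return p.getProperty("key");
    }

    public static void main(String[] args) throws IOException {
        String original = "M\u00fcnchen";
        byte[] oldFile = storeViaStream(original);
        System.out.println("ascii only: " + isAscii(oldFile));
        System.out.println("readable by Reader: " + original.equals(loadViaReader(oldFile)));
    }
}
```

Since US-ASCII is a subset of UTF-8, the old files need no conversion before the Reader-based code reads them.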

For forbidden-apis (the original forbidden-apis), I plan to allow both by default.
                
> unban properties with unicode escapes
> -------------------------------------
>
>                 Key: LUCENE-5106
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5106
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.4
>            Reporter: Robert Muir
>            Priority: Blocker
>         Attachments: LUCENE-5106.patch
>
>
> As discussed on the mailing list, it's just wrong to ban the use of unicode here.
> This blocks 4.4 (because it was committed there, too).
