maven-issues mailing list archives

From "Herve Boutemy (JIRA)" <j...@codehaus.org>
Subject [jira] Commented: (MNG-2932) Encoding chaos
Date Tue, 04 Dec 2007 19:25:58 GMT

    [ http://jira.codehaus.org/browse/MNG-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_115842 ]

Herve Boutemy commented on MNG-2932:
------------------------------------

There was real chaos here, with many separate problems.
I tried the other problem you just reported: German text in project-info-report.
I was able to reproduce it. The problem is in MPIR, not in Maven itself (the German umlaut
in pom.xml was a Maven problem).
It has been fixed in svn since August 15 2006, in version 2.1-SNAPSHOT: I checked,
and it works perfectly.
Work is in progress to release version 2.1 in the coming weeks.

If there are other problems, please open a dedicated Jira issue for each one. The encoding
chaos is now fixed overall, so what we need now are focused reports on the specific bugs that
may remain.

> Encoding chaos
> --------------
>
>                 Key: MNG-2932
>                 URL: http://jira.codehaus.org/browse/MNG-2932
>             Project: Maven 2
>          Issue Type: Bug
>          Components: POM::Encoding
>    Affects Versions: 2.0.4, 2.0.5, 2.0.6
>         Environment: windows, linux
>            Reporter: Jörg Hohwiller
>            Assignee: Herve Boutemy
>             Fix For: 2.0.8
>
>
> I have tried Maven on a project where javadocs, xdocs, and pom comments are in a native
> language with many non-ASCII characters.
> This seems to reveal that Maven does not handle different encodings cleanly.
> For instance, the xdocs are XML, and XML allows me to use different encodings if they are
> properly declared in the XML header. However, it only works if I encode the XML as UTF-8.
> If I use ISO-8859-1, the produced HTML contains UTF-8 characters from the localized site
> messages (resource bundles of Maven plugins), and Maven dumps the ISO-8859-1 encoded
> characters into that and ends up with mixed encodings in one HTML page.
> Additionally, the Java files also cause trouble when I use an encoding other than UTF-8.
> I configured the "encoding" for the javadoc plugin to ISO-8859-1 and used Java files in
> that encoding. The resulting javadoc HTML was written in ISO-8859-1, but the browser
> displayed it as UTF-8, and I had to switch explicitly to ISO-8859-1 in Firefox in order to
> have the special characters displayed properly.
> Further, I encounter trouble when I use special characters in pom.xml files that go onto
> the generated web site. In the end I could NOT find a way to have a site without problems,
> even when I encode everything as UTF-8.
> Maybe there are too few developers involved from non-English-speaking countries who
> are used to thinking beyond US-ASCII ;)
> Unfortunately I cannot tell where the problems come from: it may be XPP, doxia, the
> site-plugin, individual reports, or all of them together.
> You need to properly distinguish between input and output encoding and have to be
> extremely careful with stuff like byte[]
> and never parse XML from strings.
> Can you reproduce the problem or do you need dummy projects as test cases?
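
The advice that closes the quoted report (keep input and output encoding apart, be careful with byte[], do not parse XML from strings) can be illustrated with a small Java sketch. This is not code from Maven, doxia or any plugin: the class EncodingSketch and its methods rootText and transcode are hypothetical names chosen for the illustration, and it uses only the JDK's standard JAXP parser. It assumes an XML document whose header declares ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration class; not part of Maven or any plugin.
final class EncodingSketch {

    // Parse XML from raw bytes, so the parser itself reads the encoding
    // declared in the XML header instead of relying on a pre-decoded String.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Decode with an explicit input charset, encode with an explicit output
    // charset. Neither step falls back to the platform default encoding.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>J\u00f6rg</name>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);

        // The declared encoding is honored when parsing from bytes.
        System.out.println("J\u00f6rg".equals(rootText(latin1)));

        // The same bytes decoded with the wrong charset lose the umlaut:
        // the byte 0xF6 is not valid UTF-8 and becomes U+FFFD.
        String wrong = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(wrong.contains("\uFFFD"));

        // Writing UTF-8 output from ISO-8859-1 input: one byte becomes two.
        byte[] utf8 = transcode(new byte[] {(byte) 0xF6},
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length);
    }
}
```

The point of the sketch is that the charset is named at every boundary where bytes become characters or the reverse. Mixed-encoding output of the kind described in the report typically arises when one of those boundaries silently uses a different charset than the rest of the page.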

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
