xmlgraphics-batik-users mailing list archives

From thomas.dewe...@kodak.com
Subject Re: Unicode entity resolved on reading document
Date Tue, 31 Mar 2009 09:47:03 GMT
Hi Paul,

Paul Wellner Bou <paul@purecodes.org> wrote on 03/31/2009 02:57:17 AM:

> thomas.deweese@kodak.com wrote:
> >    I think it's better to explain why this is a problem for you.
> > As long as the text encoding is correct there shouldn't be any
> > problem with replacing the character... So why is there a problem?
> The problem is not technical in this case. It is a question of slightly 
> correcting some data in the SVG and writing it to a new file which 
> should be as similar as possible with the original file. This is 
> required as the people looking into the file to check it will compare it
> with the original, don't have much knowledge about XML/SVG and will
> reject it as there are modified lines which don't have to do anything 
> with the correction.

   Then you will either need to educate them or write a tool that
operates on the raw text stream.  You could potentially write a
post-processing step that entifies any characters outside of 7-bit
ASCII.  That might give you output almost identical to the input...
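
   A minimal sketch of such a post-processing step, assuming the
serialized SVG is available as a String.  The class name Entifier and
the method name entify are made up for illustration; nothing here is
part of Batik.  Note that it rewrites every non-ASCII character as a
hexadecimal character reference, including ones that were literal
characters in the original file or that sit inside comments or CDATA
sections, so the result is only "almost" the same:

```java
// Hypothetical helper, not part of Batik: replaces every character
// outside 7-bit ASCII with a hexadecimal character reference.
final class Entifier {
    private Entifier() {}

    static String entify(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Iterate over code points so that characters outside the BMP
        // (surrogate pairs) become a single reference.
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Gr&#xF6;&#xDF;e
        System.out.println(entify("Gr\u00F6\u00DFe"));
    }
}
```

   You would run this over the text Batik writes out, before saving
the new file.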

> So it is not possible to use an XML parser without replacing entities?

   No, and even if it were possible, Batik would then fail on valid
input such as:
        <rect fill="&#x23;&#x46;&#x46;&#x30;&#x30;&#x30;&#x30;"
              x="0" y="0" width="200" height="200"/>

   So it's likely not useful...
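
   To illustrate, here is a small sketch using the JDK's standard DOM
parser (not Batik's own document factory; the class and method names
are made up for illustration).  The character references in the fill
attribute above spell out "#FF0000", and the parser hands exactly
that string to the application.  If the references were left
unresolved, the code reading the fill value would see something that
is not a color at all:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: shows that an XML parser resolves
// character references before the application sees attribute values.
final class CharRefDemo {
    private CharRefDemo() {}

    // Parses a one-element document and returns its "fill" attribute.
    static String parsedFill(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("fill");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rect fill=\"&#x23;&#x46;&#x46;&#x30;&#x30;&#x30;&#x30;\""
                + " x=\"0\" y=\"0\" width=\"200\" height=\"200\"/>";
        // prints: #FF0000
        System.out.println(parsedFill(xml));
    }
}
```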
