struts-user mailing list archives

From "Ashish Kulkarni" <ashish.kulkarn...@gmail.com>
Subject Re: [OT] How to handle non UTF characters in XML
Date Mon, 16 Apr 2007 22:25:17 GMT
Hi,
Here is the code where I read the DOM tree and convert it to a String,
then convert this String into a byte array and then use
DocumentBuilder.parse() to parse it.

I get an error in factory.newDocumentBuilder().parse(byteArray);


        // Serialize the DOM tree (doc) to a String.
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer();
        StringWriter writer = new StringWriter();
        DOMSource source = new DOMSource(doc);
        transformer.transform(source, new StreamResult(writer));
        String obj = writer.toString();

        // Turn the String into bytes (getBytes() with no argument uses the
        // platform default charset) and parse the bytes again.
        ByteArrayInputStream byteArray = new ByteArrayInputStream(obj.getBytes());
        Document doc = factory.newDocumentBuilder().parse(byteArray);
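
One possible cause, not confirmed in this thread: obj.getBytes() encodes with the platform default charset, while the XML declaration in the serialized String typically names UTF-8. If the default charset on the German machine is not UTF-8, the bytes and the declaration disagree and the parser can reject the input. A minimal sketch of a round trip that avoids the intermediate String follows; the class name DomRoundTrip and the sample element are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original code.
public class DomRoundTrip {

    /** Serializes the DOM to UTF-8 bytes and parses those bytes back. */
    public static Document roundTrip(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        // Writing to a byte stream lets the serializer do the encoding itself,
        // so the bytes match the encoding named in the XML declaration.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(out.toByteArray()));
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("city");
        root.setTextContent("M\u00fcnchen"); // contains u-umlaut
        doc.appendChild(root);

        Document copy = roundTrip(doc);
        // The text should survive the round trip unchanged.
        System.out.println("M\u00fcnchen".equals(copy.getDocumentElement().getTextContent()));
    }
}
```

If the intermediate String has to stay, obj.getBytes("UTF-8") would be the smaller change, assuming the declaration in the String really says UTF-8.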


Ashish
On 4/16/07, Joe Germuska <joe@germuska.com> wrote:
>
> On 4/16/07, Christopher Schultz <chris@christopherschultz.net> wrote:
> >
> > Ashish,
> >
> > Ashish Kulkarni wrote:
> > > I have a Java class which creates an XML file from an SQL result set.
> > > It works fine in the USA, but I am having issues when this process runs in
> > > Germany, where they have non UTF characters in their database like ü or á.
> >
> > I think you mean non-lower-ASCII. These characters are certainly covered
> > by UTF-8.
> >
> > > How do we handle this kind of situation in an XML file? I set the XML
> > > file to be of UTF-8 type.
> >
> > How do you set the file "type" to UTF-8?
>
>
> I'm assuming Ashish is talking about the "encoding" attribute of the XML
> declaration in the first line of the file.
>
> Chris is correct that the real magic happens when you serialize the DOM to a
> file, but you should be sure to use the same encoding with the writer that
> actually creates the file as you do in the XML declaration.  If your
> characters aren't UTF-8 then don't use UTF-8.  Any decent XML reading
> software will recognize the encoding when the file is read.
>
> Joe
>
> --
> Joe Germuska
> Joe@Germuska.com * http://blog.germuska.com
>
> "The truth is that we learned from João forever to be out of tune."
> -- Caetano Veloso
>
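
To illustrate the point quoted above about keeping the writer's encoding and the XML declaration in agreement, here is a small sketch. It assumes the standard JAXP Transformer is used for serialization; the class name DeclaredEncoding and the choice of ISO-8859-1 are only examples. Setting OutputKeys.ENCODING and writing to a byte stream makes the serializer produce both the declaration and the bytes, and the parser then picks the encoding up from the declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not taken from the thread.
public class DeclaredEncoding {

    /** Serializes doc so that the declaration and the bytes use the same encoding. */
    public static byte[] serialize(Document doc, String encoding) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("name");
        root.setTextContent("J\u00fcrgen");
        doc.appendChild(root);

        byte[] bytes = serialize(doc, "ISO-8859-1");

        // No charset is passed to the parser; it reads it from the declaration.
        Document parsed = builder.parse(new ByteArrayInputStream(bytes));
        System.out.println("J\u00fcrgen".equals(parsed.getDocumentElement().getTextContent()));
    }
}
```

The same idea applies when the target is a file: hand the Transformer a FileOutputStream rather than a Writer with a separately chosen charset.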
