xml-xindice-users mailing list archives

From Murray Altheim <m.alth...@open.ac.uk>
Subject Re: problems with iso-8859-1 encoding
Date Thu, 15 May 2003 15:15:23 GMT
Simone Pierazzini wrote:
> hi everyone,
> I don't know how to store xml documents like the following:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE stylesheet[
> <!ENTITY agrave "&#224;">
> ]>
> <stylesheet>pipp&agrave;</stylesheet>
> 
> 
> in a collection. I used these commands:
> 
> xindice ad -c /db/test -n a -f a.xml
> xindice rd -c /db/test -n a -f a1.xml
> 
> but a1.xml looks like:
> 
> <?xml version="1.0"?>
> <stylesheet>pipp?</stylesheet>
> 
> that is: &agrave; became a simple ?
> 
> even if I write something like: <stylesheet>pipp&#224;</stylesheet>
> the result is the same

You shouldn't need to use a character entity at all. For example, my
emailer is Netscape 7, and I'm entering an a-grave character here: "à".
If you can't see that character correctly, where is the process
breaking down? It's hard to tell. But Netscape 7 directly supports
the encoding because I can send this message to another Netscape 7
emailer and it will display properly.

If you're actually using UTF-8 encoding, you should be able to directly
input the "à" character in your document. I don't know exactly where the
problem is, in the sense of where the entity is being mishandled. Your
DOCTYPE syntax is first of all incorrect, and should be (note the space
after 'stylesheet' is required in XML):

   <!DOCTYPE stylesheet [
   <!ENTITY agrave "&#224;">
   ]>

I'm surprised this didn't throw a parse exception (unless you've
just typed it wrong in your example here).
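
A minimal sketch of how to check that the entity itself is harmless,
assuming Python 3 and its standard xml.etree.ElementTree module (chosen
only for illustration; this is not Xindice's API, and the helper names
are hypothetical). It parses the document with the DOCTYPE written as
above, confirms that &agrave; expands to U+00E0, and shows that a
serializer can write the character either as UTF-8 bytes or as a
numeric character reference:

```python
import xml.etree.ElementTree as ET

# The document from the original post, with the space after 'stylesheet'.
# All bytes here are plain ASCII; the a-grave exists only as an entity.
DOC = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE stylesheet [\n'
    b'<!ENTITY agrave "&#224;">\n'
    b']>\n'
    b'<stylesheet>pipp&agrave;</stylesheet>\n'
)


def parse_text(doc: bytes) -> str:
    """Parse the document and return the root element's text content."""
    return ET.fromstring(doc).text


def serialize(doc: bytes, encoding: str) -> bytes:
    """Parse the document and write the root element back out as bytes."""
    return ET.tostring(ET.fromstring(doc), encoding=encoding)


text = parse_text(DOC)                 # entity expanded by the parser
utf8_out = serialize(DOC, "utf-8")     # a-grave stored as two UTF-8 bytes
ascii_out = serialize(DOC, "us-ascii") # a-grave written as &#224;

print(repr(text))
print(utf8_out)
print(ascii_out)
```

If the parsed text already contains U+00E0 at this stage, the loss is
happening later, when the text is written out or displayed.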

Something is substituting the "?" character, and I'm not sure it's
Xindice. More likely it's your own software, your XML processor, or your
display software. (But I don't know for sure; perhaps someone more
acquainted with Xindice's internals can answer better.)
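
One common way a "?" gets substituted, sketched here in Python 3 as an
assumption about the general mechanism rather than a diagnosis of
Xindice: text is encoded to a character set that cannot represent "à",
and the encoder quietly replaces it. The variable and function names
are hypothetical.

```python
TEXT = "pipp\u00e0"  # "pippà", with a-grave as U+00E0


def encode_lossy(text: str, charset: str) -> bytes:
    """Encode text, replacing unencodable characters instead of failing."""
    return text.encode(charset, errors="replace")


utf8_bytes = TEXT.encode("utf-8")          # a-grave becomes bytes C3 A0
latin1_bytes = TEXT.encode("iso-8859-1")   # a-grave becomes byte E0
ascii_bytes = encode_lossy(TEXT, "ascii")  # a-grave becomes "?"

# The reverse mismatch: ISO-8859-1 bytes read back as if they were UTF-8.
# The lone E0 byte is invalid UTF-8 and decodes to U+FFFD.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(utf8_bytes, latin1_bytes, ascii_bytes, repr(misread))
```

Comparing the raw bytes of a1.xml against these three encodings would
show which stage lost the character: an actual 3F byte ("?") in the file
means the substitution happened when the document was written, not when
it was displayed.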

Murray

...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .

   Jessica Lynch became an icon of the war, and the story of her capture by
   the Iraqis and her rescue by US special forces will go down as one of the
   most stunning pieces of news management yet conceived. It provides a
   remarkable insight into the real influence of Hollywood producers on the
   Pentagon's media managers, and has produced a template from which America
   hopes to present its future wars.  -- The Guardian, 15 May 2003
   http://www.guardian.co.uk/g2/story/0,3604,956127,00.html

