cocoon-users mailing list archives

From Torsten Schlabach <TSchlab...@gmx.net>
Subject Strange content inserted to XML file when served by Cocoon
Date Fri, 25 Apr 2003 15:52:45 GMT
Dear list,

I found something else in Cocoon 2.1 dev that scares me.

I have an XML file which starts like this:

<?xml version="1.0" encoding="iso-8859-1"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>dummy title</title>
</head>

<body>

[snip]


I have a pipeline with nothing but a FileReader and an XML serializer.
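
For readers unfamiliar with Cocoon, a pipeline of this shape might look roughly like the sitemap fragment below. This is only a sketch under assumptions, not the sitemap actually used here: the match pattern and the file name "dummy.xml" are hypothetical, "FileReader" is taken to mean the file generator (a serializer needs a generator in front of it), and the component declarations are assumed to be inherited from a parent sitemap.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- hypothetical pattern and source file, for illustration only -->
      <map:match pattern="dummy.xml">
        <!-- parse the XML file into SAX events (assumed: file generator) -->
        <map:generate type="file" src="dummy.xml"/>
        <!-- write the SAX events back out as XML, with no transformer in between -->
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

With no transformer between the two steps, whatever the parser reports while reading the file is what the serializer writes out.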

This is what I see in my browser when accessing that file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"<!--================== Imported
Names ====================================--><!-- media type, as per [RFC2045]
--><!-- comma-separated list of media types, as per [RFC2045] --><!-- a
character encoding, as per [RFC2045] --><!-- a space separated list of character
encodings, as per [RFC2045] --><!-- a language code, as per [RFC3066] --><!--
a single character, as per section 2.2 of [XML] --><!-- one or more digits
--><!-- space-separated list of link types --><!-- single or comma-separated
list of media descriptors --><!-- a Uniform Resource Identifier, see [RFC2396]
--><!-- a space separated list of Uniform Resource Identifiers --><!-- date
and time information. ISO date format --><!-- script expression --><!-- style
sheet data --><!-- used for titles etc. --><!-- render in this frame --><!--
nn for pixels or nn% for percentage length --><!-- pixel, percentage, or
relative --><!-- integer representing length in pixels --><!-- these are used
for image maps --><!-- comma separated list of lengths --><!-- used for object,
applet, img, input and iframe --><!-- a color using sRGB: #RRGGBB as Hex
values --><!-- There are also 16 widely known color names with their sRGB
values:

    Black  = #000000    Green  = #008000
    Silver = #C0C0C0    Lime   = #00FF00
    Gray   = #808080    Olive  = #808000
    White  = #FFFFFF    Yellow = #FFFF00
    Maroon = #800000    Navy   = #000080
    Red    = #FF0000    Blue   = #0000FF
    Purple = #800080    Teal   = #008080
    Fuchsia= #FF00FF    Aqua   = #00FFFF
--><!--=================== Generic Attributes
===============================--><!-- core attributes common to most elements
  id       document-wide unique id
  class    space separated list of classes
  style    associated style info
  title    advisory title/amplification
--><!-- internationalization attributes
  lang        language code (backwards compatible)
  xml:lang    language code (as per XML 1.0 spec)
  dir         direction for weak/neutral text
--><!-- attributes for common UI events
  onclick     a pointer button was clicked
  ondblclick  a pointer button was double clicked
  onmousedown a pointer button was pressed down
  onmouseup   a pointer button was released
  onmousemove a pointer was moved onto the element
  onmouseout  a pointer was moved away from the element
  onkeypress  a key was pressed and released
  onkeydown   a key was pressed down
  onkeyup     a key was released
--><!-- attributes for elements that can get the focus
  accesskey   accessibility key character
  tabindex    position in tabbing order
  onfocus     the element got the focus
  onblur      the element lost the focus
--><!-- text alignment for p, div, h1-h6. The default is
     align="left" for ltr headings, "right" for rtl
--><!--=================== Text Elements ====================================--><!--
these can occur
at block or inline level --><!-- these can only occur at block level --><!--
%Inline; covers inline or "text-level" elements --><!--==================
Block level elements ==============================--><!-- %Flow; mixes block
and inline and is used for list items etc. --><!--================== Content
models for exclusions =====================--><!-- a elements use %Inline;
excluding a --><!-- pre uses %Inline excluding img, object, applet, big, small,
     font, or basefont --><!-- form uses %Flow; excluding form --><!--
button uses %Flow; but excludes a, form, form controls, iframe
--><!--================ Document Structure ==================================--><!--
the
namespace URI designates the document profile --><!--================ Document Head
=======================================--><!-- content model is %head.misc;
combined with a single
     title and an optional base element in any order --><!-- The title
element is not considered part of the flow of text.
       It should be displayed, for example as the page header or
       window title. Exactly one title is required per document.
    --><!-- document base URI --><!-- generic metainformation --><!--
  Relationship values can be used in principle:

   a) for document specific toolbars/menus when used
      with the link element in document head e.g.
        start, contents, previous, next, index, end, help
   b) to link to a separate style sheet (rel="stylesheet")
   c) to make a link to a script (rel="script")
   d) by stylesheets to control how collections of
      html nodes are rendered into printed documents
   e) to make a link to a printable version of this document
      e.g. a PostScript or PDF version (rel="alternate" media="print")
--><!-- style info, which may include CDATA sections --><!-- script
statements, which may include CDATA sections --><!-- alternate content container for
non script-based rendering --><!--======================= Frames
=======================================--><!-- inline subwindow --><!-- alternate
content
container for non frame-based rendering --><!--=================== Document
Body ====================================--><!-- generic language/style
container --><!--=================== Paragraphs
=======================================--><!--=================== Headings
=========================================--><!--
  There are six levels of headings from h1 (the most important)
  to h6 (the least important).
--><!--=================== Lists
============================================--><!-- Unordered list bullet styles --><!--
Unordered list --><!-- Ordered
list numbering style

    1   arabic numbers      1, 2, 3, ...
    a   lower alpha         a, b, c, ...
    A   upper alpha         A, B, C, ...
    i   lower roman         i, ii, iii, ...
    I   upper roman         I, II, III, ...

    The style is applied to the sequence number which by default
    is reset to 1 for the first list item in an ordered list.
--><!-- Ordered (numbered) list --><!-- single column list (DEPRECATED)
--><!-- multiple column list (DEPRECATED) --><!-- LIStyle is constrained to:
"(%ULStyle;|%OLStyle;)" --><!-- list item --><!-- definition lists - dt for term,
dd for its definition --><!--=================== Address
==========================================--><!-- information on author
--><!--=================== Horizontal Rule
==================================--><!--=================== Preformatted Text ================================--><!--
content is
%Inline; excluding 
        "img|object|applet|big|small|sub|sup|font|basefont"
--><!--=================== Block-like Quotes
================================--><!--=================== Text alignment ===================================--><!--
center
content --><!--=================== Inserted/Deleted Text
============================--><!--
  ins/del are allowed in block and inline content, but its
  inappropriate to include block content within an ins element
  occurring in inline content.
--><!--================== The Anchor Element
================================--><!-- content is %Inline; except that anchors shouldn't
be nested
--><!--===================== Inline Elements ================================--><!--
generic language/style container --><!-- I18N BiDi over-ride --><!-- forced
line break --><!-- emphasis --><!-- strong emphasis --><!-- definitional
--><!-- program code --><!-- sample --><!-- something user would type --><!--
variable --><!-- citation --><!-- abbreviation --><!-- acronym --><!--
inlined
quote --><!-- subscript --><!-- superscript --><!-- fixed pitch font --><!--
italic font --><!-- bold font --><!-- bigger font --><!-- smaller font --><!--
underline --><!-- strike-through --><!-- strike-through --><!-- base font
size --><!-- local change to font --><!--==================== Object
======================================--><!--
  object is used to embed objects as part of HTML pages.
  param elements should precede other content. Parameters
  can also be expressed as attribute/value pairs on the
  object element itself when brevity is desired.
--><!--
  param is used to supply a named property value.
  In XML it would seem natural to follow RDF and support an
  abbreviated syntax where the param elements are replaced
  by attribute value pairs on the object start tag.
--><!--=================== Java applet
==================================--><!--
  One of code or object attributes must be present.
  Place param elements before other content.
--><!--=================== Images
===========================================--><!--
   To avoid accessibility problems for people who aren't
   able to see the image, you should provide a text
   description using the alt and longdesc attributes.
   In addition, avoid the use of server-side image maps.
--><!-- usemap points to a map element which may be in this document
  or an external document, although the latter is not widely supported
--><!--================== Client-side image maps
============================--><!-- These can be placed in the same document or grouped
in a
     separate document although this isn't yet widely supported
--><!--================ Forms ===============================================--><!--
forms shouldn't be nested --><!--
  Each label must not contain more than ONE field
  Label elements shouldn't be nested.
--><!-- the name attribute is required for all but submit & reset --><!--
form control --><!-- option selector --><!-- option group --><!-- selectable
choice --><!-- multi-line text field --><!--
  The fieldset element is used to group form fields.
  Only one legend element should occur in the content
  and if present should only be preceded by whitespace.
--><!-- fieldset label --><!--
 Content is %Flow; excluding a, form, form controls, iframe
--><!-- push button --><!-- single-line text input control (DEPRECATED)
--><!--======================= Tables
=======================================--><!-- Derived from IETF HTML table standard,
see [RFC1942] --><!--
 The border attribute sets the thickness of the frame around the
 table. The default units are screen pixels.

 The frame attribute specifies which parts of the frame around
 the table should be rendered. The values are not the same as
 CALS to avoid a name clash with the valign attribute.
--><!--
 The rules attribute defines which rules to draw between cells:

 If rules is absent then assume:
     "none" if border is absent or border="0" otherwise "all"
--><!-- horizontal placement of table relative to document --><!--
horizontal alignment attributes for cell contents

  char        alignment char, e.g. char=':'
  charoff     offset for alignment char
--><!-- vertical alignment attributes for cell contents --><!--
colgroup groups a set of col elements. It allows you to group
several semantically related columns together.
--><!--
 col elements define the alignment properties for cells in
 one or more columns.

 The width attribute specifies the width of the columns, e.g.

     width=64        width in screen pixels
     width=0.5*      relative width of 0.5

 The span attribute causes the attributes of one
 col element to apply to more than one column.
--><!--
    Use thead to duplicate headers when breaking table
    across page boundaries, or for static headers when
    tbody sections are rendered in scrolling panel.

    Use tfoot to duplicate footers when breaking table
    across page boundaries, or for static footers when
    tbody sections are rendered in scrolling panel.

    Use multiple tbody sections when rules are needed
    between groups of table rows.
--><!-- Scope is simpler than headers attribute for common tables --><!-- th
is for headers, td for data and for cells acting as both -->>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>dummy title</title>
</head>

<body>
[snip]
</html>

What is that? Is it a feature, or am I missing something about XML basics?

Torsten

