xmlgraphics-fop-users mailing list archives

From John Brown <johnbrown...@hotmail.com>
Subject Cannot convert MS Word HTML to PDF with fop-0.94; fop-0.20.5 worked
Date Fri, 02 Nov 2007 16:49:03 GMT

Hello All,

First of all, I know nothing about Xalan, FOP, etc.  I'm probably breaking
all your posting rules. Maybe the list server figured that out and that's why
it rejected my post via Gmane as spam.  Anyway, I'm just trying
to convert a Microsoft Word HTML file to PDF.

The following article in the JavaWorld Forums:

http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html

explains how to convert HTML to PDF by using JTidy, Xalan and Apache FOP.
The steps are as follows:

1) HTML -> XHTML using jtidy-04aug2000r7-dev
2) XHTML -> XML-FO using xalan-j 2.7.0 and xhtml2fo.xsl downloaded from:
http://www.antennahouse.com/XSLsample/sample-xsl-xhtml2fo/xhtml2fo.xsl
3) XML-FO -> PDF using Apache fop-0.20.5
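
For reference, step 2 above can be driven from Java through the standard
javax.xml.transform (JAXP) API, which Xalan-J implements. This is only a
minimal sketch: the class name XhtmlToFo is made up for illustration, and
the code uses whichever XSLT processor the JVM finds first, which is not
necessarily Xalan-J 2.7.0.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the name is not taken from the article.
final class XhtmlToFo {

    /** Applies an XSLT stylesheet to an XML input and writes the result. */
    static void transform(Source stylesheet, Source input, Result output)
            throws TransformerException {
        // newInstance() picks the XSLT processor configured for this JVM.
        Transformer t = TransformerFactory.newInstance().newTransformer(stylesheet);
        t.transform(input, output);
    }

    /** Convenience wrapper for in-memory strings, handy for small experiments. */
    static String transform(String stylesheet, String xml)
            throws TransformerException {
        StringWriter out = new StringWriter();
        transform(new StreamSource(new StringReader(stylesheet)),
                  new StreamSource(new StringReader(xml)),
                  new StreamResult(out));
        return out.toString();
    }
}
```

With files, the first overload would be called with
new StreamSource(new File("xhtml2fo.xsl")), new StreamSource(new File("temp.xml"))
and new StreamResult(new File("temp.fo")).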

I have JRE and JDK 1.6 installed in default locations on Windows XP.

The results are not quite satisfactory:

1) I had to change the left and right margins in the stylesheet to 0.0in to
avoid truncation of certain paragraph headings.

2) Certain GIF images were not rendered correctly. These were very small
(1.4K - 3.19K, C++ class hierarchy diagrams, text in boxes), but in
terms of dimensions were similar in size to other images that were
rendered correctly.

3) Chapters don't start on a new page. I assume that they do in the
original Word document, but I do not know that for a fact.

I wanted to try FOP-0.94, but I get these messages:

C:\progra~1\utils\fop-0.94\fop.bat temp.fo temp.pdf

Nov 2, 2007 9:22:58 AM org.apache.fop.cli.Main startFOP
SEVERE: Exception
javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException:
file:/C:/Documents%20and%20Settings/johnbrown105/MYDOCU~1/TICPP_vol2/html/temp.fo:4:591:
Error(4/591): fo:block, Error processing foreign attribute:
http://www.w3.org/XML/1998/namespace/@xml:lang
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:168)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:115)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)

---------

javax.xml.transform.TransformerException: org.apache.fop.fo.ValidationException:
file:/C:/Documents%20and%20Settings/johnbrown105/MYDOCU~1/TICPP_vol2/html/temp.fo:4:591:
Error(4/591): fo:block, Error processing foreign attribute:
http://www.w3.org/XML/1998/namespace/@xml:lang
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:501)
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:165)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:115)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)
Caused by: org.apache.fop.fo.ValidationException:
file:/C:/Documents%20and%20Settings/johnbrown105/MYDOCU~1/TICPP_vol2/html/temp.fo:4:591:
Error(4/591): fo:block, Error processing foreign attribute:
http://www.w3.org/XML/1998/namespace/@xml:lang
        at org.apache.fop.fo.FONode.attributeError(FONode.java:330)
        at org.apache.fop.fo.PropertyList.handleInvalidProperty(PropertyList.java:469)
        at org.apache.fop.fo.PropertyList.addAttributesToList(PropertyList.java:328)
        at org.apache.fop.fo.FObj.processNode(FObj.java:121)
        at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.startElement(FOTreeBuilder.java:320)
        at org.apache.fop.fo.FOTreeBuilder.startElement(FOTreeBuilder.java:185)
        at org.apache.xalan.transformer.TransformerIdentityImpl.startElement(TransformerIdentityImpl.java:1072)
        at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
        at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
        ... 4 more
---------
org.apache.fop.fo.ValidationException:
file:/C:/Documents%20and%20Settings/johnbrown105/MYDOCU~1/TICPP_vol2/html/temp.fo:4:591:
Error(4/591): fo:block, Error processing foreign attribute:
http://www.w3.org/XML/1998/namespace/@xml:lang
        at org.apache.fop.fo.FONode.attributeError(FONode.java:330)
        at org.apache.fop.fo.PropertyList.handleInvalidProperty(PropertyList.java:469)
        at org.apache.fop.fo.PropertyList.addAttributesToList(PropertyList.java:328)
        at org.apache.fop.fo.FObj.processNode(FObj.java:121)
        at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.startElement(FOTreeBuilder.java:320)
        at org.apache.fop.fo.FOTreeBuilder.startElement(FOTreeBuilder.java:185)
        at org.apache.xalan.transformer.TransformerIdentityImpl.startElement(TransformerIdentityImpl.java:1072)
        at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
        at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
        at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:165)
        at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:115)
        at org.apache.fop.cli.Main.startFOP(Main.java:166)
        at org.apache.fop.cli.Main.main(Main.java:197)

I should point out that I get these errors even if I use FOP instead of
Xalan-J 2.7.0 to generate temp.fo from temp.xml.

I should also point out that, while fop-0.20.5 generates the PDF, it does so
after generating 16,315 lines of console output, mostly:

[ERROR] unknown font sans-serif,normal,bolder so defaulted font to any

/* and other fonts in the document, probably hundreds of lines */

Other common errors are:
[ERROR] property - "xml:lang" is not implemented yet.
[ERROR] Error in text-align-last property value 'relative':
org.apache.fop.fo.expr.PropertyException: No conversion defined

Is there an easy fix for this stylesheet?  Does anyone know
of an XHTML -> XML-FO stylesheet whose output will work with fop-0.94?
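
One possible workaround, offered only as a sketch: since fop-0.94 stops on
the xml:lang attribute, that attribute could be stripped from temp.fo
before FOP reads it. The code below does this with an identity transform
through the standard JAXP API. The class name StripXmlLang is hypothetical,
and whether fop-0.94 then accepts the rest of the file (for example the
text-align-last value 'relative' reported above) is not something this
sketch establishes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of FOP or Xalan.
final class StripXmlLang {

    // Identity transform plus one empty template. The empty template matches
    // any attribute named "lang" in the XML namespace (that is, xml:lang) and
    // produces nothing, so the attribute is dropped from the output.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"@*[local-name()='lang' and"
      + " namespace-uri()='http://www.w3.org/XML/1998/namespace']\"/>"
      + "</xsl:stylesheet>";

    /** Copies input to output, omitting every xml:lang attribute. */
    static void strip(Source input, Result output) throws TransformerException {
        TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)))
            .transform(input, output);
    }

    /** Convenience wrapper for in-memory strings. */
    static String strip(String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        strip(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

The same effect could probably be had by editing xhtml2fo.xsl so that it
never emits xml:lang in the first place, which would avoid the extra pass.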
