xmlgraphics-fop-users mailing list archives

From Dennis van Zoerlandt <dvzoerla...@vanboxtel.nl>
Subject Re: AW: AW: AW: AW: OutOfMemoryException while transforming large XML to PDF
Date Fri, 01 Apr 2011 11:13:22 GMT

Hi Andreas,

Alright, it seems a logical explanation that you need a large heap to
produce documents this large.

Font auto-detection seems to be off. No auto-detect flag is present in the
FOP configuration file, and I also didn't include a manifest file with
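
For reference, this is roughly what the flag in question looks like when it is present in a FOP configuration file. It is only a minimal sketch, assuming the PDF renderer is the one being configured; a configuration without the auto-detect element inside fonts leaves auto-detection off.

```xml
<?xml version="1.0"?>
<!-- Sketch of a FOP configuration fragment with font auto-detection ON.
     Removing the <auto-detect/> element turns auto-detection off. -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Scans the operating system's font directories at startup -->
        <auto-detect/>
      </fonts>
    </renderer>
  </renderers>
</fop>
```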

I will look further into modifying the XSL file in such a way that multiple
page-sequences are used. I think it's the best solution so far. Am I
correct to say that multiple page-sequences won't affect the final page
layout of the PDF file? How can I split up the content into multiple
page-sequences? I think a modification to the XML input file is also
necessary?
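
To make the idea concrete, here is a minimal sketch of how a stylesheet could divide content over several page-sequences without touching the XML input. The element names records and record, the chunk-size parameter and the A4 page master are all hypothetical, since the real input structure is not shown in this thread; the sketch also assumes a chunk-size greater than 1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical: number of input records per page-sequence (must be > 1) -->
  <xsl:param name="chunk-size" select="200"/>

  <!-- Hypothetical input: <records><record>...</record>...</records> -->
  <xsl:template match="/records">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="A4"
            page-height="29.7cm" page-width="21cm"
            margin-top="2cm" margin-bottom="2cm"
            margin-left="2cm" margin-right="2cm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>

      <!-- Select the first record of every chunk (1st, 201st, 401st, ...)
           and open a new page-sequence for each of them -->
      <xsl:for-each select="record[position() mod $chunk-size = 1]">
        <fo:page-sequence master-reference="A4">
          <fo:flow flow-name="xsl-region-body">
            <!-- The chunk is the current record plus the next
                 (chunk-size - 1) sibling records -->
            <xsl:apply-templates
                select=". | following-sibling::record[position() &lt; $chunk-size]"/>
          </fo:flow>
        </fo:page-sequence>
      </xsl:for-each>
    </fo:root>
  </xsl:template>

  <!-- Placeholder formatting for a single record -->
  <xsl:template match="record">
    <fo:block><xsl:value-of select="."/></fo:block>
  </xsl:template>

</xsl:stylesheet>
```

One thing to keep in mind with a sketch like this: every fo:page-sequence starts on a new page, so a page break appears at each chunk boundary. Splitting on a natural boundary in the data (per customer or per order, for example) rather than on a fixed count would probably hide that better.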

Another question: is there a reliable way to predict or calculate the page
count of the PDF file before any transformation is started? I can check the
size of the XML input file, but that isn't really reliable because the
complexity of the XSL stylesheet is also a factor. I'm thinking of aborting
the task when the resulting PDF file would have 100+ pages (for instance).
Is this possible?

Best regards,
Dennis van Zoerlandt

Andreas Delmelle-2 wrote:
> On 31 Mar 2011, at 15:08, Dennis van Zoerlandt wrote:
> Hi Dennis
>> In the meanwhile I have tested a few things. In the attachment you'll find a
>> FO file ( http://old.nabble.com/file/p31286241/fop1.0-5000-fo.zip
>> fop1.0-5000-fo.zip ) which has scrambled data because of confidentiality.
>> I created the FO file with XMLspy and tried to create a PDF file with Apache
>> FOP 1.0 (fop.bat) on my Windows XP workstation. It produced (what it seems)
>> this error (see below). No PDF file was created.
> It seems like the classic "cram all content into one page-sequence" issue. 
> With a file of that size, there is little or nothing you can do. The
> current architecture of FOP does not allow to render such documents
> without a sufficiently large heap.
> That said: I wrote the above while I was running your sample file (with
> FOP Trunk, using Saxon as XSLT/JAXP implementation), and it just completed
> on my end, with a heap of 1GB. It did take about 7 minutes, but still... I
> got a nice output file of 455 pages.
> I doubt that it is related to images, as there is only one
> fo:external-graphic. 
> Do you have font auto-detection enabled, by any chance? That might consume
> an unnecessary amount of heap space, for example, if you only actually use
> a handful of custom fonts, but have a large number of those installed on
> your system.
> Another option is that some fixes for memory-leaks, applied to Trunk after
> the 1.0 release, are actually helping here.
>> Splitting the XML input file into several chunks is not a preferable option
>> for me, nevertheless it is a valid one.
> Note: it is, strictly speaking, not necessary to split up the input so
> that you have several FOs. What would suffice is to modify the stylesheet,
> so that the content is divided over multiple page-sequences. If you can
> keep the size of the page-sequences down to, say, 30 to 40 pages, that
> might already reduce the overall memory usage significantly.
> There are known cases of people rendering documents of +10.000 pages. No
> problem, iff not all of those pages are generated by the same
> fo:page-sequence.
> Regards
> Andreas
> ---
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
> For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org

View this message in context: http://old.nabble.com/OutOfMemoryException-while-transforming-large-XML-to-PDF-tp31236044p31293232.html
Sent from the FOP - Users mailing list archive at Nabble.com.
