xmlgraphics-fop-users mailing list archives

From Simon Pepping <spepp...@leverkruid.eu>
Subject Re: FOP and large documents (again)
Date Wed, 03 Aug 2011 08:55:35 GMT
On Wed, Aug 03, 2011 at 10:23:48AM +0200, Stephan Thesing wrote:
> Looking at the code (as far as I understand it), for each page-sequence
> all KnuthElements are computed first by the layout managers.
> This is split only for forced page breaks.
> Then on the whole sequence, possible page break positions are searched for.
> Only after this are the actual output areas computed and pages produced.
> Clearly, this doesn't scale for large page-sequences...
> Is there a reason why this approach was chosen, instead of "lazily" (or
> on-demand) computing KnuthElements, putting them on the page and, as soon as
> it is filled, passing it to the renderer?

Both line and page breaking use the Knuth total-fit algorithm. That
algorithm requires the complete content before it can be applied.
Note that TeX does not do this for page breaking; there it uses a best fit.
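The difference between the two strategies can be sketched as follows. This is
a hypothetical illustration in Python, not FOP's actual Java code; plain block
heights stand in for Knuth elements, and the "cost" is the squared leftover
space on each page:

```python
def total_fit_breaks(heights, page_height):
    """Knuth-style total fit: dynamic programming over all feasible break
    points, minimizing the summed squared leftover space across all pages.
    It cannot emit a single page until the whole `heights` list is known.
    (Assumes every individual block fits on a page.)"""
    n = len(heights)
    INF = float("inf")
    best = [INF] * (n + 1)   # best[i] = minimal cost to lay out blocks 0..i-1
    prev = [0] * (n + 1)     # prev[i] = start of the page ending at block i-1
    best[0] = 0.0
    for i in range(1, n + 1):
        used = 0
        for j in range(i - 1, -1, -1):   # try page containing blocks j..i-1
            used += heights[j]
            if used > page_height:
                break
            cost = best[j] + (page_height - used) ** 2
            if cost < best[i]:
                best[i], prev[i] = cost, j
    breaks, i = [], n
    while i > 0:                         # recover the chosen break points
        breaks.append(i)
        i = prev[i]
    return list(reversed(breaks))

def best_fit_breaks(heights, page_height):
    """Greedy best fit: close a page as soon as the next block no longer
    fits. Needs only one page's worth of content in memory at a time."""
    breaks, used = [], 0
    for i, h in enumerate(heights):
        if used + h > page_height:
            breaks.append(i)
            used = 0
        used += h
    breaks.append(len(heights))
    return breaks
```

The point of the sketch is the shape of the dependency: the greedy loop can
stream pages out as it goes, while the total-fit version must hold every
element (and the whole cost table) until the end, which is exactly the memory
behavior discussed above.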

For FOP it would be better if it could apply either strategy, at the
user's demand. But FOP is coded such that it first collects all
content, doing all line breaking in paragraphs in the process, before
it starts its page breaking algorithm. Therefore a best fit page
breaking algorithm alone does not solve the memory problem. Changing
this so that page breaking (best or total fit, at the user's choice)
is considered while collecting content has proven too hard (or too
time-consuming) until now. See e.g.

There is a best fit page breaking algorithm, which is mainly used for
cases with varying page widths. But it is a hack in the sense that it
throws away all collected content beyond the current page, and
restarts the process.
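That restart-style hack can be caricatured like this. Again a hypothetical
Python sketch, not FOP's code; `layout` stands in for the expensive element
generation that gets redone after every break:

```python
def restart_best_fit(blocks, page_height, layout):
    """Lay out content, keep only the first page, throw the rest away,
    and start over from the break point. Memory stays bounded at one
    page, but everything after each break is laid out again."""
    pages = []
    remaining = list(blocks)
    while remaining:
        elements = [layout(b) for b in remaining]  # re-laid-out each pass
        used, count = 0, 0
        for h in elements:
            if count and used + h > page_height:
                break                              # page is full
            used += h
            count += 1
        pages.append(remaining[:count])
        remaining = remaining[count:]  # discarded work is redone next pass
    return pages
```

With n pages this lays out the tail of the document n times, which is why it
trades the memory problem for wasted work rather than solving it cleanly.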

So, help needed.


To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org
