xmlgraphics-fop-users mailing list archives

From "Puppala, Kumar (LNG-CON)" <kumar.pupp...@lexisnexis.com>
Subject RE: Fop 0.20.5 vs Fop Trunk Performace
Date Fri, 15 Feb 2008 19:37:28 GMT

   Provided below are my responses.


Thanks !!


-----Original Message-----
From: Andreas Delmelle [mailto:andreas.delmelle@telenet.be] 
Sent: Wednesday, February 13, 2008 3:16 PM
To: fop-users@xmlgraphics.apache.org
Subject: Re: Fop 0.20.5 vs Fop Trunk Performace


On Feb 12, 2008, at 21:10, Puppala, Kumar (LNG-CON) wrote:


Hi Kumar


> >

> > Just to be sure: which revision of the trunk are you trying out?


> I obtained the latest from the trunk on Jan 22nd. Information  

> pertaining to the fopTrunk as seen in the status.xml file is:

> "status.xml 614201 2008-01-22 14:02:27Z jeremias"


OK, I think you can safely update this to the latest. Not that it  

will matter much.



> <snip />

> Yes, I am instantiating FopFactory just once. Initially I was not  

> doing that but I changed my code to instantiate it just once. The  

> results provided are with the change.



> > Apart from that, focusing purely on FOP Trunk, if you know how to

> > narrow it down to specific methods/calls that cause the increase in

> > processing time that would help us a lot.


> I do have the complete Heap report. Some of the classes having  

> maximum instances are as shown below:


Well, it's not so much the number of objects I'm thinking of, but  

rather, how much time is spent executing specific methods and which  

ones take longer in later iterations. The actual cause of the  

slowdown may precisely be located in a class of which there are  

relatively few instances alive, if I judge correctly. Or did you  

already check whether the bulk of the increase in processing-time is  

really only spent on garbage-collection?
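
One way to check the garbage-collection share from inside the JVM is the standard java.lang.management API. The sketch below is only an illustration and not FOP code: the class name GcShare and the placeholder workload are hypothetical, and the collection time the JVM reports is approximate (with concurrent collectors it is not all pause time).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of FOP: compares a task's wall-clock time
// with the garbage-collection time the JVM reports over the same interval.
class GcShare {

    // Sum of accumulated collection time (ms) over all collectors.
    // getCollectionTime() returns -1 when a collector does not report it.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Runs the task once and returns {elapsedMillis, gcMillis}.
    static long[] measure(Runnable task) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = (System.nanoTime() - start) / 1000000L;
        long gcMillis = totalGcMillis() - gcBefore;
        return new long[] { elapsedMillis, gcMillis };
    }

    public static void main(String[] args) {
        // Placeholder workload that allocates short-lived objects; in a real
        // test this would be one full rendering iteration.
        Runnable workload = new Runnable() {
            public void run() {
                for (int i = 0; i < 200000; i++) {
                    byte[] junk = new byte[1024];
                    junk[0] = (byte) i;
                }
            }
        };
        for (int run = 1; run <= 5; run++) {
            long[] r = measure(workload);
            System.out.println("run " + run + ": elapsed=" + r[0] + " ms, gc=" + r[1] + " ms");
        }
    }
}
```

If the gc figure grows from run to run while the remainder of the elapsed time stays flat, the slowdown is mostly collection; if both grow, method-level profiling is still needed.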


--> I did verify that garbage collection is playing a very significant
role in these increased timings. I am currently using the Java profiling
tool to identify the methods with the highest execution time. I ran the
basic simple.fo file provided in the examples directory and below is the
screenshot from the profiling tool:




I need to do further analysis on this, but comparing this output to the
one generated by running the same FO file against the FOP 0.20.5
codebase, I do see two significant areas (lineBreak and layoutMgr) that
are taking time. The profiling output for the same test on the FOP 0.20.5
codebase is as shown below:




> 463132 instances of class org.apache.fop.traits.MinOptMax

> 441537 instances of class org.apache.fop.layoutmgr.NonLeafPosition

<snip />


> I am not sure if this is something expected.


Is this an overall total, or a snapshot taken at a given point? These  

are figures I'd expect for a rather large page-sequence...

--> This is the overall total taken at the end of all the iterations.


> > Small question: Did you, by any chance, also try different JVM

> > versions? Different platform?


> No. I can try on jre1.6.0_04. Since we are running the current FOP  

> on Solaris platform, we are performing our tests on Solaris.


> > Do you know which XML parser / XSLT processor gets used at runtime?


> We do not use an XSLT processor. We generate the FO file using an  

> in-house application and feed it to the FOP Server. Since I am  

> using the default handler, I think it's using a SAX parser behind the  

> scenes.


Right, now I remember you already mentioned this earlier.


> <snip />

> > In local tests I ran here, with two concurrent threads and a shared

> > FopFactory instance, the processing time remains quite stable for me

> > (test run on Apple JVM 1.5 using a document that generates two page-

> > sequences (=2+69 pages; the larger page-sequence contains forced

> > breaks for each page))


> My tests are much more diverse. Each iteration contains about 120  

> testcases. Each testcase targets a specific feature that we use.  

> Hence each such iteration covers most of the features like tables,  

> cells, images, big documents, rowspanning, columnSpanning, dual  

> column layout etc... In total I would say I am generating about 3000  

> pages per iteration. I compare the results after each such  

> iteration, over about 15 iterations, and I am seeing a gradual  

> increase in processing times.


Interesting. Can you somehow dump the testcases as a set of physical  

FO files, and make that available somewhere? This would make it  

possible for us to run the same tests locally, and investigate further.

--> The FO files have been zipped and placed on a public share:



If this is impossible for you, then I'd advise to start with a  

drastically trimmed-down version of your test-suite, and gradually  

change and/or increase the number of tests. See if you can isolate  

the problem to a specific set of files (tables? markers? custom  

fonts? etc.) At least that will give us a clue on where to start  

looking. What may also prove valuable is to try the tests using a  

different renderer.


One more remark:

> > 2)       I do see a lot of garbage collection happening in the new

> > FOP. The collection times are also very high.


As I already hinted at, this is not bad per se. This could simply  

indicate that FOP Trunk offers the GC more opportunities to clean up,  

so as to reduce the average footprint (when looking at it like a  

series of snapshots). Memory-consumption vs. processing-speed is  

virtually always a trade-off: the less info is cached, the more  

computations need to be performed multiple times, but a calculator  

that caches /all/ results and /never/ makes the same computation  

twice, requires an insane amount of memory...
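
The trade-off can be made concrete with a size-bounded cache. The sketch below is an illustration only and has nothing to do with FOP's internals: BoundedCacheDemo, LruCache and the squaring "expensive" computation are all made-up names, built on the standard LinkedHashMap access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only (not FOP code): a size-bounded cache in front of an
// expensive computation. A smaller capacity holds less in memory but
// forces more recomputation.
class BoundedCacheDemo {

    // Least-recently-used map: evicts the eldest entry once capacity is exceeded.
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = access order, needed for LRU
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    private final Map<Integer, Long> cache;
    int computations = 0; // how often the expensive path actually ran

    BoundedCacheDemo(int capacity) {
        this.cache = new LruCache<Integer, Long>(capacity);
    }

    // Stand-in for an expensive calculation.
    private long expensive(int n) {
        computations++;
        return (long) n * n;
    }

    long get(int n) {
        Long cached = cache.get(n);
        if (cached == null) {
            cached = expensive(n);
            cache.put(n, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] capacities = { 1, 4, 16 };
        for (int capacity : capacities) {
            BoundedCacheDemo demo = new BoundedCacheDemo(capacity);
            // Request keys 0..7 cyclically for 10 rounds.
            for (int round = 0; round < 10; round++) {
                for (int key = 0; key < 8; key++) {
                    demo.get(key);
                }
            }
            System.out.println("capacity=" + capacity + " computations=" + demo.computations);
        }
    }
}
```

With a cyclic access pattern, a cache smaller than the working set misses on every request, which is the "more computations performed multiple times" end of the trade-off.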


That said, it still remains strange that the processing time  

increases with the number of runs... Can you try leaving the  

iterations running into the hundreds or thousands? Does the time keep  

increasing? By the same amount?

--> I will run some tests next week to verify this.
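
For such a long run, a small harness that records each iteration's duration and fits a straight line through the samples would show whether the time keeps increasing and by roughly how much per iteration. This is a rough sketch under assumptions: IterationTrend and the placeholder iteration body are hypothetical, and the first iterations should probably be discarded because JIT warm-up distorts them.

```java
// Hypothetical harness, not FOP code: times repeated iterations and
// estimates whether the per-iteration time drifts upward.
class IterationTrend {

    // Least-squares slope of y against x = 0, 1, 2, ...
    // (units of y per iteration).
    static double slope(double[] y) {
        int n = y.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (double v : y) {
            meanY += v;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (y[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    // Runs the task the given number of times; returns each duration in ms.
    static double[] timeRuns(Runnable task, int iterations) {
        double[] millis = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        return millis;
    }

    public static void main(String[] args) {
        Runnable iteration = new Runnable() {
            public void run() {
                // Placeholder for one full pass over the test suite.
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 100000; i++) {
                    sb.append(i);
                }
            }
        };
        double[] millis = timeRuns(iteration, 200);
        System.out.println("slope: " + slope(millis) + " ms per iteration");
    }
}
```

A slope near zero over hundreds of iterations would suggest the early increase levels off; a steady positive slope would point to something accumulating between runs.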


> <snip />

> > Come to think of it: are your images stored on a local disk, or is

> > there any network traffic involved that might explain the increasing

> > lag...?


> The images are stored on local disk. However, I do see better  

> results for testcases containing Images and hence I do not believe  

> that there are any network traffic issues involved.


Sorry, I did not mean 'images' but more generally 'documents'. Are  

the input/output files all located on the same machine, or does some  

of it come from/end up on different machines? If so, are these  

machines dedicated to serving the I/O requests for your FOP process,  

or are they used for other processes as well?

--> The input/output files are all located on the same machine.







