cocoon-users mailing list archives

From "Lars Huttar" <lars_hut...@sil.org>
Subject memory issues, SAX
Date Tue, 03 Aug 2004 03:25:12 GMT
Dear Cocoon gurus,              [Cocoon 2.1.2, Tomcat 4.1]

We have an application that needs to generate an index
from a large database. We seem to be running out of memory
even when just getting the unprocessed data out of the database.

We initially did (sitemap pseudocode)
  - xsp query to get rows from "Index" table of database
  - XSLT transformation that groups together rows with certain identical fields
  - XSLT transformation that wraps "source:write" markup around
    the XML
  - the write-source transformer to put the XML into a file
  - (serialize as XML)
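
In actual sitemap terms it's roughly this (I've simplified the
match pattern, file names, and stylesheet names for this post):

    <map:match pattern="write-index">
      <!-- XSP page runs the database query and streams the rows -->
      <map:generate type="serverpages" src="xsp/index-query.xsp"/>
      <!-- group rows with identical key fields -->
      <map:transform src="stylesheets/group-index-rows.xsl"/>
      <!-- wrap the output in source:write markup -->
      <map:transform src="stylesheets/wrap-source-write.xsl"/>
      <!-- write-source = SourceWritingTransformer; writes the
           fragment out to a file -->
      <map:transform type="write-source"/>
      <map:serialize type="xml"/>
    </map:match>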

This worked for small rowsets, but when we jumped from 3700 to
9500 rows, it failed with the message
org.apache.cocoon.ProcessingException: Exception in ServerPagesGenerator.generate():
java.lang.RuntimeException: org.apache.cocoon.ProcessingException:
insertFragment: fragment is required.

which sounds like the write-source transformer is complaining that
it never got its "fragment" (the data to write to the file), so I
supposed there was a failure before the write-source transformer.
I wondered if the XSLT transformations were each building
a DOM for the entire input. This would account for running out
of memory.

So I tried reducing the pipeline to just obtaining the data
and writing it to a file without grouping.
First I tried

  - xsp query to get rows from "Index" table of database
  - XSLT transformation that just wraps "source:write" markup around
    the XML
  - the write-source transformer to put the XML into a file
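
The wrapping stylesheet is essentially just this (output path
simplified; source:write and source:fragment come from the
SourceWritingTransformer's namespace):

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:source="http://apache.org/cocoon/source/1.0">
      <!-- wrap the whole document so write-source saves it -->
      <xsl:template match="/">
        <source:write create="true">
          <source:source>context://work/index.xml</source:source>
          <source:fragment>
            <xsl:copy-of select="."/>
          </source:fragment>
        </source:write>
      </xsl:template>
    </xsl:stylesheet>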

but this failed too, and of course it still has an XSLT
transformation, which is suspect -- is it building a DOM? So next
I tried

  - file generator to get a file that contained a source:write
    wrapper around a cinclude statement
  - cinclude transformer to get the data
  - the write-source transformer to put the XML into a file

And in a separate pipeline called by the cinclude statement,

  - xsp query to get rows from "Index" table of database
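
Concretely, the static file looks something like this (paths and
the internal pipeline name simplified):

    <source:write xmlns:source="http://apache.org/cocoon/source/1.0"
                  xmlns:cinclude="http://apache.org/cocoon/include/1.0">
      <source:source>context://work/index.xml</source:source>
      <source:fragment>
        <!-- pulls the query results in from the separate pipeline -->
        <cinclude:include src="cocoon:/index-rows"/>
      </source:fragment>
    </source:write>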

But this still failed!

So now I'm wondering how it's possible to process big sets
of data at all in Cocoon. We thought SAX meant that the XML
data was sent piece-by-piece down the pipeline, serially,
so you didn't run out of memory when you had a big XML data
file. Does using XSLT mess that up by building DOMs?
What about cinclude?
What *can* you use to get lots of data from a database
and process it without having to have it all in memory
at once? Does this task need to be done outside of Cocoon?

Of course, we can split the operation up into little pieces;
but we don't want to go through that hassle if it's avoidable.

Is it possible that I'm missing the point completely and
there's something other than memory that's causing the
operation to fail?
By the way, this machine has 384 MB of RAM, and another I was
testing on had 512 MB. They both failed at about the same point.

Thanks for any explanations or suggestions...
Lars



