poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 54790] Word Document loading strategy is memory hungry and causes OutOfMemoryError
Date Tue, 02 Apr 2013 20:20:39 GMT

--- Comment #1 from Sergey Vladimirov <vlsergey@gmail.com> ---

How much memory does your JVM have? Is it the standard (JVM-default) 64/128 MB
setting, or is it some kind of mobile system?
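
In case it helps to answer that, here is a minimal sketch for printing the heap
limits the JVM is actually running with. It uses only java.lang.Runtime; the
class name HeapInfo and the helper maxHeapMegabytes() are made up for this
example and are not part of POI.

```java
class HeapInfo {
    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // maxMemory(): upper bound for the heap (what -Xmx controls)
        System.out.println("max heap:   " + rt.maxMemory() / mb + " MB");
        // totalMemory(): heap currently reserved by the JVM
        System.out.println("total heap: " + rt.totalMemory() / mb + " MB");
        // freeMemory(): unused part of the currently reserved heap
        System.out.println("free heap:  " + rt.freeMemory() / mb + " MB");
    }
}
```

Running it with the same options as the failing application (for example the
same -Xmx value, if one is set) should show whether the limit is the default one.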

Sometimes loading the whole file into memory is the only way to process it. For
example, you can't even break text into paragraphs without checking TextPiece
content. And using TextPiece as just a lightweight proxy to DocumentStream is
going to be very inefficient (due to the required character encoding/decoding).
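
To illustrate the cost I mean, here is a rough sketch of such a proxy. This is
not POI code: LazyTextPiece, its fields and the decodeCount counter are
hypothetical names invented for the example, and ISO-8859-1 is used only as a
stand-in for the 8-bit code page of compressed pieces. The point is that a
proxy holding only an offset into the stream has to decode the bytes again
every time someone asks for the text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LazyTextPiece {
    private final byte[] stream;   // stands in for the document stream bytes
    private final int offset;      // start of this piece within the stream
    private final int length;      // length of this piece in bytes
    private final Charset charset;
    int decodeCount = 0;           // how many times the bytes were decoded

    LazyTextPiece(byte[] stream, int offset, int length, boolean unicode) {
        this.stream = stream;
        this.offset = offset;
        this.length = length;
        // Assumption: 16-bit pieces are UTF-16LE; ISO-8859-1 is a stand-in
        // for the 8-bit case so the example needs only guaranteed charsets.
        this.charset = unicode ? StandardCharsets.UTF_16LE
                               : StandardCharsets.ISO_8859_1;
    }

    /** Decodes the bytes on every call; nothing is cached. */
    String text() {
        decodeCount++;
        return new String(stream, offset, length, charset);
    }

    public static void main(String[] args) {
        byte[] stream = "First paragraph.\rSecond paragraph.\r"
                .getBytes(StandardCharsets.UTF_16LE);
        LazyTextPiece piece = new LazyTextPiece(stream, 0, stream.length, true);

        // Finding paragraph ends requires the decoded characters.
        String t = piece.text();
        int paragraphs = 0;
        for (int i = 0; i < t.length(); i++) {
            if (t.charAt(i) == '\r') paragraphs++;
        }
        piece.text(); // any later reader pays for decoding again

        System.out.println(paragraphs + " paragraphs, "
                + piece.decodeCount + " decodes");
    }
}
```

Caching the decoded String inside the proxy would avoid the repeated decoding,
but then the text is held in memory again, which is what the proxy was
supposed to avoid.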

Also, disabling preserveTextTable means the whole text is reconstructed into a
single buffer (StringBuilder). And in most cases there is no single pointer into
the document stream; the text is a reconstruction of a pretty complex structure
using data from ComplexFileTable. Perhaps it is possible to use a "lightweight"
"TextPieceProxy" when "preserveTextTable=true" if we only need to read text.
But from my point of view, it is not a nice way to do it.

