poi-dev mailing list archives

From Yegor Kozlov <ye...@dinom.ru>
Subject a 'lite' version of ooxml-schemas jar
Date Mon, 16 Nov 2009 16:53:48 GMT
Hi All,

As we discussed at ApacheCon, one way to reduce the size of POI distributions
is to create a 'lite' version of the ooxml-schemas jar.
The idea is simple: remove all unused classes and resources from the jar
generated by XMLBeans. Rough estimates made at the BarCamp showed that POI
uses less than 30% of the OOXML schemas, hence the optimized jar should be
significantly smaller.

With this in mind I created a simple utility called OOXMLLite, see 

The process includes four simple steps:

  - run all ooxml unit tests
  - see which classes from ooxml-schemas.jar are loaded into the JVM
  - copy the loaded classes into some directory
  - copy the binary resources (.xsb)
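
To make steps two to four concrete, here is a minimal sketch. It is not the
OOXMLLite code itself. It assumes the unit tests were run with the JVM option
-verbose:class and that the big jar has been unpacked into a directory. The
class name LiteJarSketch, its method names and the two log-line patterns are
assumptions made for illustration; the exact log format depends on the JVM
version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of POI.
final class LiteJarSketch {
    // Assumed shapes of -verbose:class lines: the older "[Loaded X from Y]"
    // form and the newer unified-logging "... X source: Y" form.
    private static final Pattern OLD_STYLE = Pattern.compile("^\\[Loaded (\\S+) from ");
    private static final Pattern NEW_STYLE = Pattern.compile("\\] (\\S+) source: ");

    /** Step 2: names of the classes whose log line mentions the given jar. */
    static Set<String> loadedFrom(List<String> logLines, String jarName) {
        Set<String> names = new TreeSet<>();
        for (String line : logLines) {
            if (!line.contains(jarName)) continue;
            Matcher m = OLD_STYLE.matcher(line);
            if (!m.find()) {
                m = NEW_STYLE.matcher(line);
                if (!m.find()) continue;
            }
            names.add(m.group(1));
        }
        return names;
    }

    /** Step 3: copy the .class file of each loaded class, keeping the package layout. */
    static int copyClasses(Path unpackedJar, Path liteDir, Set<String> classNames)
            throws IOException {
        int copied = 0;
        for (String name : classNames) {
            String rel = name.replace('.', '/') + ".class";
            Path src = unpackedJar.resolve(rel);
            if (!Files.isRegularFile(src)) continue; // class is not in this jar
            Path dst = liteDir.resolve(rel);
            Files.createDirectories(dst.getParent());
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            copied++;
        }
        return copied;
    }

    /** Step 4: copy every .xsb resource, whether or not it is actually needed. */
    static int copyXsb(Path unpackedJar, Path liteDir) throws IOException {
        List<Path> xsbs;
        try (Stream<Path> walk = Files.walk(unpackedJar)) {
            xsbs = walk.filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".xsb"))
                       .collect(Collectors.toList());
        }
        for (Path src : xsbs) {
            Path dst = liteDir.resolve(unpackedJar.relativize(src).toString());
            Files.createDirectories(dst.getParent());
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        }
        return xsbs.size();
    }
}
```

A caller would read the test run's log into a list of lines, pass it to
loadedFrom together with the jar's file name, and then call copyClasses and
copyXsb with the same source and target directories. Nested classes need no
special handling because their binary names (Outer$Inner) map directly to
file names.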

A good acceptance test is to run the ooxml unit tests against the 'lite'
classes: all should pass. There is an accompanying Ant task, ooxml-xsds-lite,
for that; see build.xml.

The resulting 'lite' jar is much smaller: ooxml-schemas-lite-3.6-beta1.jar is
only 3.5 MB, while the 'big' ooxml-schemas-1.0.jar is 14.5 MB. In theory, the
size can be trimmed below 3 MB, because my utility currently copies all .xsb
files and does not yet track resource dependencies.

I propose including ooxml-schemas-lite in the release cycle. The artifact name
is ooxml-schemas-lite-${version.id}.jar.
Interested projects (first of all I mean Apache Tika) can set up their Maven
POMs to use <artifactId>poi-ooxml-lite</artifactId> instead of
<artifactId>poi-ooxml</artifactId>. This will reduce the distribution size by
approximately 10 MB.

