poi-dev mailing list archives

From David Fisher <dfis...@jmlafferty.com>
Subject Re: a 'lite' version of ooxml-schemas jar
Date Mon, 16 Nov 2009 17:26:07 GMT
Hi Yegor,

+1

This will have effects on the website re-write.

(1) The "How to Build" page has a list of common targets. Here is what  
I have currently:

clean -- Erase all build work products (i.e. everything in the build directory)
compile -- Compiles all files from main, contrib and scratchpad
test -- Run all unit tests from main, contrib and scratchpad (JUnit)
jar -- Produce jar files
docs -- Generate all documentation for the system (Apache Forrest)
dist -- Create a distribution (JUnit and Apache Forrest)

The lite jar should always be part of the dist target. Should we add a
target for building a "lite" ooxml, or should this always be part of jar
and test?

I think we should have a "lite" target separate from jar and test.

(2) I am reworking the home page. There is a table of components that
appears there.

Document -- Component -- JAR -- Maven artifactId
OLE2 Filesystem -- POIFS -- poi-version-yyyymmdd.jar -- poi
OLE2 Property Sets -- HPSF -- poi-version-yyyymmdd.jar -- poi
Excel XLS -- HSSF -- poi-version-yyyymmdd.jar -- poi
Excel XLSX -- XSSF -- poi-ooxml-version-yyyymmdd.jar -- poi-ooxml
PowerPoint PPT -- HSLF -- poi-scratchpad-version-yyyymmdd.jar -- poi-scratchpad
PowerPoint PPTX -- XSLF -- poi-ooxml-version-yyyymmdd.jar -- poi-ooxml
Word DOC -- HWPF -- poi-scratchpad-version-yyyymmdd.jar -- poi-scratchpad
Word DOCX -- XWPF -- poi-ooxml-version-yyyymmdd.jar -- poi-ooxml
Visio VSD -- HDGF -- poi-scratchpad-version-yyyymmdd.jar -- poi-scratchpad
Publisher PUB -- HPBF -- poi-scratchpad-version-yyyymmdd.jar -- poi-scratchpad
Outlook MSG -- HSMF -- poi-scratchpad-version-yyyymmdd.jar -- poi-scratchpad

I am missing the OOXML schemas in my list. With this new lite version  
I need two rows.

OOXML Schemas -- OpenXML4J -- ooxml-schemas-yyyymmdd.jar -- poi-ooxml
OOXML Lite -- OpenXML4J -- ooxml-schemas-lite-yyyymmdd.jar -- poi-ooxml-lite

We will need to include poi-ooxml-version-yyyymmdd.jar in the
poi-ooxml-lite target as well. I'll mark the XLSX, XWPF, and XSLF rows
appropriately.

Correct?

(3) I'll rewrite your description as a new page within the currently
very sparse OOXML documentation.

BTW - the www.openxml4j.org domain has gone away and I am going to  
need help from you in deciding additional documentation and OPC  
examples that we should include for the OOXML sub-project.

Regards,
Dave

On Nov 16, 2009, at 8:53 AM, Yegor Kozlov wrote:

> Hi All,
>
> As we discussed at ApacheCon, one way to optimize the size of POI
> distributions is to create a 'lite' version of the ooxml-schemas jar.
> The idea is simple: remove all unused classes and resources from the  
> jar generated by XMLBeans. Rough estimations made at the Barcamp  
> showed that POI uses less than 30% of the OOXML schemas, hence the  
> optimized jar should be significantly smaller.
>
> With this in mind I created a simple utility called OOXMLLite, see http://svn.apache.org/repos/asf/poi/trunk/src/ooxml/java/org/apache/poi/util/OOXMLLite.java
>
> The process includes four simple steps:
>
> - run all ooxml unit tests
> - see what classes from the ooxml-schemas.jar are loaded in the JVM
> - copy the loaded classes into some directory
> - copy the binary resources (.xsb)
>
> A good acceptance test is to run the ooxml unit tests against the  
> 'lite' classes - all should pass. There is an accompanying Ant task  
> ooxml-xsds-lite for that, see build.xml.
>
> The resulting 'lite' jar is much smaller: ooxml-schemas-lite-3.6-beta1.jar
> is only 3.5 MB while the 'big' ooxml-schemas-1.0.jar is 14.5 MB. In
> theory, the size can be trimmed down below 3 MB - my utility copies all
> .xsb files and does not yet track resource dependencies.
>
> I propose to include ooxml-schemas-lite in the release cycle. The  
> artifact name is ooxml-schemas-lite-${version.id}.jar.
> Interested projects (first of all I mean Apache Tika) can set up
> their Maven poms to use <artifactId>poi-ooxml-lite</artifactId>
> instead of <artifactId>poi-ooxml</artifactId>. This will reduce the
> distribution size by approximately 10 MB.
>
> Yegor
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
> For additional commands, e-mail: dev-help@poi.apache.org
>
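
For readers of the archive, here is a minimal Java sketch of the two copy steps listed in the quoted message (copy the loaded classes, then copy every .xsb resource). It is not the actual OOXMLLite code. The names LiteCopySketch and copyLite are hypothetical, and the sketch assumes the list of loaded class names is handed in by the caller (for example, gathered from JVM -verbose:class output) and that the schemas jar has already been unpacked into a directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the real org.apache.poi.util.OOXMLLite.
class LiteCopySketch {

    // Map a binary class name (a.b.C) to its path inside an unpacked jar (a/b/C.class).
    static String classToPath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Copy the named classes plus every .xsb file from src to dst.
    // Returns the number of files copied. Class names with no matching
    // file under src are silently skipped.
    static int copyLite(Path src, Path dst, Collection<String> loadedClasses)
            throws IOException {
        int copied = 0;

        // Step 3 of the quoted list: copy only the classes that were loaded.
        for (String name : loadedClasses) {
            Path rel = Paths.get(classToPath(name));
            Path from = src.resolve(rel);
            if (Files.isRegularFile(from)) {
                copyFile(from, dst.resolve(rel));
                copied++;
            }
        }

        // Step 4: copy all binary schema resources (.xsb), without
        // tracking which of them are actually needed.
        List<Path> xsbFiles;
        try (Stream<Path> walk = Files.walk(src)) {
            xsbFiles = walk
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".xsb"))
                    .collect(Collectors.toList());
        }
        for (Path from : xsbFiles) {
            copyFile(from, dst.resolve(src.relativize(from)));
            copied++;
        }
        return copied;
    }

    private static void copyFile(Path from, Path to) throws IOException {
        Files.createDirectories(to.getParent());
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
    }

    // Usage: java LiteCopySketch.java <unpacked-jar-dir> <output-dir> <class-name>...
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: LiteCopySketch <src> <dst> [class-name ...]");
            return;
        }
        List<String> names = Arrays.asList(args).subList(2, args.length);
        int n = copyLite(Paths.get(args[0]), Paths.get(args[1]), names);
        System.out.println("copied " + n + " files");
    }
}
```

Copying every .xsb file unconditionally mirrors the limitation mentioned in the quoted message: resource dependencies are not tracked, so the output is larger than strictly necessary.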



