poi-dev mailing list archives

From Antony Bowesman <...@teamware.com>
Subject Text extractor meta data
Date Mon, 05 Jan 2009 02:15:22 GMT
I'm using POI 3.5 beta 4 with ExtractorFactory to get an extractor for various 
types of MS document.  I see that OOXML does not yet support metadata, but for 
the OLE variants I'm having trouble getting the metadata in a simple way.

The only method in the returned POITextExtractor is getText(), which gives a 
line-delimited String of PID_XXX = value pairs, so I have to parse the strings 
out and match them against the PropertyIDMap names.
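
To illustrate the string-parsing route described above, here is a minimal
sketch using only the JDK. It assumes each property sits on its own line in the
form PID_XXX = value; values that themselves contain line breaks would not
survive this. The class name PropertyTextParser and the sample values are
hypothetical and not part of POI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a POI class: turns "NAME = value" lines into a map.
final class PropertyTextParser {

    // Parses line-delimited "NAME = value" text into an insertion-ordered map.
    // Lines without a " = " separator are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String line : text.split("\\r?\\n")) {
            // Split on the first " = " only, so values may contain " = ".
            int sep = line.indexOf(" = ");
            if (sep <= 0) {
                continue;
            }
            String name = line.substring(0, sep).trim();
            String value = line.substring(sep + 3).trim();
            props.put(name, value);
        }
        return props;
    }

    public static void main(String[] args) {
        // Sample text standing in for the extractor's getText() output.
        String sample = "PID_TITLE = Quarterly report\nPID_AUTHOR = A. Writer\n";
        Map<String, String> props = parse(sample);

        // The wanted names would come from external config in practice.
        for (String wanted : new String[] {"PID_AUTHOR", "PID_TITLE"}) {
            System.out.println(wanted + " -> " + props.get(wanted));
        }
    }
}
```

The wanted property names can then be looked up by the same strings that the
external config uses, without a per-property getter.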

Alternatively, I can cast the returned extractor to POIOLE2TextExtractor and 
then get the SummaryInformation (SI) and DocumentSummaryInformation (DSI) from 
there, but I then simply want to get certain properties from those.  I don't 
want to hard-code calls to methods like getAuthor(), as the required properties 
are driven from external config.

The getProperty() method is protected for some reason, but getProperties() 
is not.

What's the recommended way to get the properties I want?

