poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 54937] New: Strange author table structures in word documents failing the text extraction entirely.
Date Wed, 08 May 2013 13:27:31 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=54937

            Bug ID: 54937
           Summary: Strange author table structures in word documents
                    failing the text extraction entirely.
           Product: POI
           Version: unspecified
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HWPF
          Assignee: dev@poi.apache.org
          Reporter: shu.yang@icims.com
    Classification: Unclassified

Here's the stack trace of the exception.

Caused by: java.lang.UnsupportedOperationException: Non-extended character
Pascal strings are not supported right now. Please, contact POI developers for
update.
at org.apache.poi.hwpf.model.SttbUtils.read(SttbUtils.java:66)
at org.apache.poi.hwpf.model.SttbUtils.readSttbSavedBy(SttbUtils.java:116)
at org.apache.poi.hwpf.model.SavedByTable.<init>(SavedByTable.java:53)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:360)
at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:80)

This happens in Tika 1.3.
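
For context, the trace above shows the string table reader rejecting a table whose strings use one byte per character ("non-extended") rather than two. The following is a minimal, self-contained sketch of how both variants could be read. It is not POI's code: the class SttbSketch and the method readSttb are hypothetical names, and the layout is an assumption (an optional 0xFFFF marker, a 2-byte string count, a 2-byte extra-data size, then length-prefixed strings). Some real tables use a 4-byte count, and the single-byte code page depends on the document, so ISO-8859-1 below is only a placeholder.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Apache POI.
final class SttbSketch {

    private SttbSketch() {
    }

    /**
     * Reads a string table starting at the given offset.
     * Assumed layout (little-endian):
     *   [0xFFFF marker, present only for 2-byte characters]
     *   [2-byte string count] [2-byte extra-data size per entry]
     *   then, per entry: [length] [characters] [extra data]
     */
    static List<String> readSttb(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(offset);

        // Assumption: if the first two bytes are 0xFFFF the strings are
        // "extended" (UTF-16LE); otherwise those bytes already hold the count.
        int first = buf.getShort() & 0xFFFF;
        boolean extended = first == 0xFFFF;
        int count = extended ? (buf.getShort() & 0xFFFF) : first;
        int extraSize = buf.getShort() & 0xFFFF;

        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Extended entries carry a 2-byte character count,
            // non-extended entries a 1-byte character count.
            int length = extended ? (buf.getShort() & 0xFFFF) : (buf.get() & 0xFF);
            byte[] raw = new byte[extended ? length * 2 : length];
            buf.get(raw);
            // Assumption: ISO-8859-1 stands in for the document's ANSI code page.
            result.add(new String(raw,
                    extended ? StandardCharsets.UTF_16LE : StandardCharsets.ISO_8859_1));
            // Skip the per-entry extra data, which this sketch ignores.
            buf.position(buf.position() + extraSize);
        }
        return result;
    }
}
```

With this layout, a reader that only accepts the 0xFFFF marker would fail on any document whose table starts directly with the count, which matches the exception message in the trace.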

Thanks!

-- 
You are receiving this mail because:
You are the assignee for the bug.


