poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 51803] Content on master slide is not extracted
Date Sat, 05 Nov 2011 10:55:13 GMT

mikemccand <lucene@mikemccandless.com> changed:

           What    |Removed                     |Added
             Status|RESOLVED                    |REOPENED
         Resolution|WORKSFORME                  |

--- Comment #2 from mikemccand <lucene@mikemccandless.com> 2011-11-05 10:55:13 UTC ---
I think there is still a problem here: with the example PPT I
attached, I see boilerplate text when I run PowerPointExtractor (which
does set the flag to include master slide text, in its static main
method).

I see code in HSLF for detecting whether a given Shape is a placeholder
(MasterSheet.isPlaceholder), so it seems possible to avoid extracting
such text. However, I'm not familiar enough with the APIs: for example,
when Sheet.findTextRuns is invoked for a MasterSlide, how can it get
the Shape for each run and then skip that run's text if the Shape is a
placeholder?
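
A minimal sketch of the filtering idea, to make the question concrete. It
does not use the POI API: MasterShape is a hypothetical stand-in type, and
its placeholder field stands for whatever a check such as
MasterSheet.isPlaceholder would report for the real Shape. The sample
strings are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

public class PlaceholderFilterSketch {

    /** Hypothetical stand-in for a shape on a master sheet; NOT a POI class. */
    static final class MasterShape {
        final String text;
        final boolean placeholder; // assumed to come from a placeholder check

        MasterShape(String text, boolean placeholder) {
            this.text = text;
            this.placeholder = placeholder;
        }
    }

    /** Collects text from master shapes, skipping placeholder shapes. */
    static List<String> extractMasterText(List<MasterShape> shapes) {
        List<String> out = new ArrayList<>();
        for (MasterShape shape : shapes) {
            if (shape.placeholder) {
                continue; // boilerplate such as "Click to edit ..." lives here
            }
            if (shape.text != null && !shape.text.isEmpty()) {
                out.add(shape.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<MasterShape> shapes = new ArrayList<>();
        shapes.add(new MasterShape("Click to edit Master title style", true));
        shapes.add(new MasterShape("Confidential", false));
        System.out.println(extractMasterText(shapes));
    }
}
```

The open part of the question remains how to obtain the Shape that owns
each text run, so that a check like this can be applied at all.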

