tika-dev mailing list archives

From "Steve Gullion (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1440) Auto-Paragraph numbers not extracted from Word Document
Date Thu, 09 Oct 2014 16:58:33 GMT
Steve Gullion created TIKA-1440:
-----------------------------------

             Summary: Auto-Paragraph numbers not extracted from Word Document 
                 Key: TIKA-1440
                 URL: https://issues.apache.org/jira/browse/TIKA-1440
             Project: Tika
          Issue Type: Bug
          Components: parser
         Environment: Windows 7, Windows Server 2008, Tomcat
            Reporter: Steve Gullion
            Priority: Minor


When Tika extracts text from a Microsoft Word document that uses automatic paragraph numbering,
the numbers themselves are missing from the extracted text. Because the numbers can be critical
to the meaning of the document (for example, when the text cross-references a numbered
paragraph), they should be calculated and included in the output if at all possible.
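
As a rough illustration of what "calculated" could mean here: Word does not store the
numbers as literal text, so an extractor would have to derive them from each paragraph's
list level. The sketch below is plain Java and is not Tika or POI code. The class name
AutoNumberSketch, the method number, and the maxLevels parameter are all hypothetical, and
it assumes a single list, decimal labels joined by dots, and levels that never skip a step.
Real documents can use other formats, start values, and restart rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of multilevel auto-numbering; not part of Tika or POI.
final class AutoNumberSketch {

    // levels[i] is the zero-based list level of numbered paragraph i.
    // maxLevels is an assumed upper bound on nesting depth.
    static List<String> number(int[] levels, int maxLevels) {
        int[] counters = new int[maxLevels];
        List<String> labels = new ArrayList<>();
        for (int level : levels) {
            // Advance the counter for this paragraph's own level.
            counters[level]++;
            // Deeper levels restart whenever a shallower level advances.
            for (int deeper = level + 1; deeper < maxLevels; deeper++) {
                counters[deeper] = 0;
            }
            // Build a label such as "2.1." from the counters down to this level.
            StringBuilder label = new StringBuilder();
            for (int i = 0; i <= level; i++) {
                label.append(counters[i]).append('.');
            }
            labels.add(label.toString());
        }
        return labels;
    }

    public static void main(String[] args) {
        int[] levels = {0, 1, 1, 0, 1};
        // prints [1., 1.1., 1.2., 2., 2.1.]
        System.out.println(number(levels, 9));
    }
}
```

An extractor following this idea would prepend each computed label to the paragraph text
it emits, so that a cross-reference such as "see 2.1" still resolves in the plain-text output.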



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
