tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2569) Grouped Text boxes in .ppt
Date Fri, 02 Mar 2018 21:10:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384155#comment-16384155 ]

Tim Allison commented on TIKA-2569:

[~BAEApache], if all goes according to plan, we'll start the release process in about a week.
The process itself can take a week or two. You can follow our discussion on our dev list:

If you'd like to test 1.18 vs 1.17, you can grab a nightly build from jenkins, e.g. [here|https://builds.apache.org/job/Tika-trunk/1442/org.apache.tika$tika-app/]
and use tika-eval to run comparisons: https://wiki.apache.org/tika/TikaEval. Let us know
if you find any regressions before the 1.18 release!
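
For a quick spot check on a handful of files before setting up tika-eval, something like the following rough sketch might help. It is not tika-eval and is much cruder: it assumes `java` is on the PATH, that you have downloaded two tika-app jars locally (the jar file names in the usage note are hypothetical), and it relies on tika-app's `--text` option to print plain text. The helper names (`extract_text`, `missing_tokens`, `lost_content`, `compare_versions`) are made up for illustration.

```python
import subprocess
from typing import Iterable, List, Set


def extract_text(tika_app_jar: str, document: str) -> str:
    """Run one tika-app jar in plain-text mode and return what it extracted."""
    result = subprocess.run(
        ["java", "-jar", tika_app_jar, "--text", document],
        capture_output=True,
        text=True,
        check=True,  # raise if the JVM or the parser exits with an error
    )
    return result.stdout


def missing_tokens(old_text: str, new_text: str) -> Set[str]:
    """Whitespace-separated tokens present in the old extract but not the new one."""
    return set(old_text.split()) - set(new_text.split())


def lost_content(old_text: str, new_text: str) -> bool:
    """True when the old version found some text and the new version found none."""
    return bool(old_text.strip()) and not new_text.strip()


def compare_versions(old_jar: str, new_jar: str, documents: Iterable[str]) -> List[str]:
    """Return the documents for which the new jar drops tokens the old jar extracted."""
    suspects = []
    for doc in documents:
        old_text = extract_text(old_jar, doc)
        new_text = extract_text(new_jar, doc)
        if lost_content(old_text, new_text) or missing_tokens(old_text, new_text):
            suspects.append(doc)
    return suspects
```

Assuming jars named `tika-app-1.17.jar` and `tika-app-1.18-SNAPSHOT.jar` in the working directory, `compare_versions("tika-app-1.17.jar", "tika-app-1.18-SNAPSHOT.jar", ["Presentation1.ppt"])` would list the files worth a closer look. A token-set difference ignores ordering and duplicates, so treat any hit as a prompt to inspect the output by hand rather than as proof of a regression.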

> Grouped Text boxes in .ppt
> --------------------------
>                 Key: TIKA-2569
>                 URL: https://issues.apache.org/jira/browse/TIKA-2569
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Richard A
>            Assignee: Tim Allison
>            Priority: Major
>              Labels: easyfix
>             Fix For: 1.18, 2.0.0
>         Attachments: Presentation1.ppt, Presentation1.pptx
> Grouped Text boxes are unable to be parsed and no content is returned when items have
> been grouped together. This issue does not seem to affect .pptx files, only .ppt. The attached
> documents are the same except the file format. It should give a very simple example of a .ppt
> document where no content will be returned.

This message was sent by Atlassian JIRA
