tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (TIKA-1730) Excel to HTML filtering seems to produce some font setting gibberish in output
Date Mon, 04 Jan 2016 15:29:39 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison reassigned TIKA-1730:
---------------------------------

    Assignee: Tim Allison

> Excel to HTML filtering seems to produce some font setting gibberish in output
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-1730
>                 URL: https://issues.apache.org/jira/browse/TIKA-1730
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Matt Sheppard
>            Assignee: Tim Allison
>
> Noticed while upgrading from Tika 1.8 to 1.10: an .xls file linked below, which used to filter pretty normally, now produces the following...
> {noformat}
> <div class="outside">&amp;C&amp;"Arial,Bold"&amp;11&amp;F</div>
> {noformat}
> ...seemingly at the end of the first sheet's output when filtered with {{java -jar tika-app-1.10.jar funnelback-claim-form-with-expense-codes.xls}}.
> It looks like styling information that should not be displayed as text here.
> Would be nice if that could be fixed in some future version.
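
The string in the output above appears to be an Excel header/footer format string after HTML escaping. In that syntax, &C selects the center section, &"Arial,Bold" sets the font, &11 sets the font size, and &F stands for the file name. Below is a minimal sketch of stripping such codes before emitting text, assuming the raw (unescaped) value is &C&"Arial,Bold"&11&F. The class and method names are hypothetical, the pattern covers only the common codes, and this is not presented as the fix used in Tika or POI.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class HeaderFooterCodes {

    // Assumption: covers the common codes, not the full Excel syntax.
    private static final Pattern CODES = Pattern.compile(
        "&&"                 // escaped literal ampersand
        + "|&\"[^\"]*\""     // font name and style, e.g. &"Arial,Bold"
        + "|&\\d+"           // font size, e.g. &11
        + "|&[A-Za-z]"       // single-letter codes, e.g. &C, &F, &P
    );

    // Removes formatting and field codes, keeping any literal text.
    static String stripCodes(String raw) {
        Matcher m = CODES.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // "&&" becomes a single "&"; every other code is dropped.
            String replacement = m.group().equals("&&") ? "&" : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, the value from the report reduces to an empty string, since it consists only of codes. A real fix would more likely substitute field codes such as &F with their values rather than drop them.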



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
