tika-dev mailing list archives

From "Nick C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files
Date Wed, 13 Apr 2016 20:22:25 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239960#comment-15239960 ]

Nick C commented on TIKA-1513:

bq. Well, you know there's still plenty of time to get that into Tika 2.0

Maybe I'll add that to my to-do list. I have been wanting to work on improving the RTF parser
to handle tables/HTML and generate valid XHTML (multiple lists seem to cause issues).

bq. Ballpark, how many dbfs do you have to dev with? Do you want some from our test corpus?

At least 200. I would like more to test with, though.

> Add mime detection and parsing for dbf files
> --------------------------------------------
>                 Key: TIKA-1513
>                 URL: https://issues.apache.org/jira/browse/TIKA-1513
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 1.13
> I just came across an Apache licensed dbf parser that is available on [maven|https://repo1.maven.org/maven2/org/jamel/dbf/dbf-reader/0.1.0/dbf-reader-0.1.0.pom].
> Let's add dbf parsing to Tika.
> Any other recommendations for alternate parsers?

This message was sent by Atlassian JIRA