tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1332) Create "eval" code
Date Fri, 10 Feb 2017 18:45:41 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861665#comment-15861665 ]

Tim Allison commented on TIKA-1332:
-----------------------------------

Some more work is required, but I think tika-eval is getting close to being ready to commit.
 

If anyone has a chance to review, the code is on my [GitHub fork|https://github.com/tballison/tika/tree/TIKA-1302],
and the beginnings of the documentation are now up on our [wiki|https://wiki.apache.org/tika/TikaEval].

Thank you!

> Create "eval" code
> ------------------
>
>                 Key: TIKA-1332
>                 URL: https://issues.apache.org/jira/browse/TIKA-1332
>             Project: Tika
>          Issue Type: Sub-task
>          Components: cli, general, server
>            Reporter: Tim Allison
>         Attachments: comparison_reports.xml
>
>
> For this issue, we can start with code to gather statistics on each run (# of exceptions
> per file type, most common exceptions per file type, number of metadata items, total text
> extracted, etc).  We should also be able to compare one run against another.  Going forward,
> there's plenty of room to improve.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
