tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-1657) Allow easier XML serialization of TikaConfig
Date Thu, 03 Sep 2015 19:38:45 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-1657:
    Attachment: TIKA-1657v1.patch

First very rough draft of code...

Will need more input on whether the hierarchical approach makes sense for composite, decorator,
delegating (not yet handled!) parsers.
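
To make the hierarchical question concrete, here is a minimal sketch of what nesting could look like. It is not taken from the attached patch and does not use any Tika API: ParserNode, escape, write and toXml are hypothetical stand-ins, and the Tika class names appear only as example strings. The nested <parser> layout is the idea under discussion, not necessarily what the patch emits or what Tika's config reader accepts.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: serialize a tree of parsers to a tika-config-style
 * XML document. None of these types are Tika classes.
 */
class ConfigDumpSketch {

    /** Stand-in for a parser: a class name plus the parsers it wraps or contains. */
    static final class ParserNode {
        final String className;
        final List<ParserNode> children = new ArrayList<>();

        ParserNode(String className, ParserNode... kids) {
            this.className = className;
            for (ParserNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Escape the characters that are unsafe inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Append one <parser> element, recursing into any child parsers. */
    static void write(ParserNode node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        out.append(indent).append("<parser class=\"")
           .append(escape(node.className)).append('"');
        if (node.children.isEmpty()) {
            // Leaf parser: a self-closing element is enough.
            out.append("/>\n");
            return;
        }
        // Composite or decorator: children become nested elements.
        out.append(">\n");
        for (ParserNode child : node.children) {
            write(child, depth + 1, out);
        }
        out.append(indent).append("</parser>\n");
    }

    /** Wrap the parser tree in the <properties><parsers> envelope. */
    static String toXml(ParserNode root) {
        StringBuilder out = new StringBuilder();
        out.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.append("<properties>\n  <parsers>\n");
        write(root, 2, out);
        out.append("  </parsers>\n</properties>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Example tree: a composite holding one plain parser and one decorated parser.
        ParserNode root = new ParserNode("org.apache.tika.parser.DefaultParser",
                new ParserNode("org.apache.tika.parser.pdf.PDFParser"),
                new ParserNode("org.apache.tika.parser.ParserDecorator",
                        new ParserNode("org.apache.tika.parser.html.HtmlParser")));
        // Print to stdout, as the issue proposes for TikaConfig's main().
        System.out.print(toXml(root));
    }
}
```

In this sketch a decorator is treated as a parser with exactly one child, so it needs no special element. Whether that is enough for delegating parsers, or whether they need their own handling, is the open question.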

> Allow easier XML serialization of TikaConfig
> --------------------------------------------
>                 Key: TIKA-1657
>                 URL: https://issues.apache.org/jira/browse/TIKA-1657
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 1.11
>         Attachments: TIKA-1558-blacklist-effective.xml, TIKA-1657v1.patch
> In TIKA-1418, we added an example of how to dump the config file so that users could
> easily modify it.  I think we should go further and make this an option at the tika-core level
> with hooks for tika-app and tika-server.  I propose adding a main() to TikaConfig that will
> print the XML config file that Tika is currently using to stdout.
> I'd like to put this into core so that, e.g., Solr's DIH users can get by without having
> to download tika-app separately.
> There's every chance that I've not accounted for issues with dynamic loading, etc.  Also,
> I'd be OK with only having this available in tika-app and tika-server if there are good reasons.
> Feedback?

This message was sent by Atlassian JIRA
