tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1508) Add uniformity to parser parameter configuration
Date Sun, 14 Aug 2016 19:49:20 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420462#comment-15420462 ]

Hudson commented on TIKA-1508:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1092 (See [https://builds.apache.org/job/Tika-trunk/1092/])
TIKA-1508 add some unit tests for ParameterizedParserTest (tallison: rev 18ab8f91f19b105270a107174afd64fe81d20f72)
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-bad-types.xml
* (delete) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-parametrized.xml
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-some-parameters.xml
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-bad-parameters.xml
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-bad-values.xml
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-1986-parameterized.xml
* (edit) tika-core/src/test/java/org/apache/tika/parser/ConfigurableParserTest.java
* (delete) tika-core/src/test/java/org/apache/tika/parser/DummyParametrizedParser.java
* (delete) tika-core/src/test/java/org/apache/tika/parser/ParametrizedParserTest.java
* (add) tika-core/src/test/java/org/apache/tika/parser/DummyParameterizedParser.java
* (add) tika-core/src/test/java/org/apache/tika/parser/ParameterizedParserTest.java
TIKA-1508 proof of concept with one parameter on PDFParser (tallison: rev 853750d47fa99afad0df6d4f4727a35c96675254)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* (add) tika-parsers/src/test/resources/org/apache/tika/parser/pdf/tika-config.xml
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
updated with the new changes of TIKA-1508 (thammegowda: rev 31cf12d51539c1ec01c12d73f96e7eadfd2339f0)
* (edit) tika-core/src/main/java/org/apache/tika/utils/AnnotationUtils.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/recognition/tf/TensorflowImageRecParser.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/recognition/ObjectRecognitionParser.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/recognition/tf/TensorflowImageRecParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/recognition/ObjectRecogniser.java
Update changes for TIKA-1508, TIKA-1993, TIKA-1986. (mattmann: rev da82df5e9def9698fd32f85fe706660641d7c31f)
* (edit) CHANGES.txt


> Add uniformity to parser parameter configuration
> ------------------------------------------------
>
>                 Key: TIKA-1508
>                 URL: https://issues.apache.org/jira/browse/TIKA-1508
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Chris A. Mattmann
>             Fix For: 1.14
>
>
> We can currently configure parsers by the following means:
> 1) programmatically by direct calls to the parsers or their config objects
> 2) sending in a config object through the ParseContext
> 3) modifying .properties files for specific parsers (e.g. PDFParser)
> Rather than scattering the landscape with .properties files for each parser, it would
> be great if we could specify parser parameters in the main config file, something along the
> lines of this:
> {noformat}
>     <parser class="org.apache.tika.parser.audio.AudioParser">
>       <params>
>         <int name="someparam1">2</int>
>         <str name="someOtherParam2">something or other</str>
>       </params>
>       <mime>audio/basic</mime>
>       <mime>audio/x-aiff</mime>
>       <mime>audio/x-wav</mime>
>     </parser>
> {noformat}
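
To make the proposed format concrete, here is a minimal sketch of how a `<params>` block like the one quoted above could be read into typed values. It uses only the JDK's DOM parser and is not Tika's actual implementation: the class name `ParamsSketch` and the method `readParams` are hypothetical, and the sketch assumes `int` and `str` are the only parameter element types, since those are the only two shown in the proposal.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Tika.
public class ParamsSketch {

    /**
     * Reads the first <params> element in the given XML and converts each
     * child element to a typed value keyed by its "name" attribute.
     * Assumption: only <int> and <str> children are supported.
     */
    public static Map<String, Object> readParams(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, Object> params = new LinkedHashMap<>();
        NodeList paramsNodes = doc.getElementsByTagName("params");
        if (paramsNodes.getLength() == 0) {
            return params; // no <params> block: nothing to configure
        }

        NodeList children = paramsNodes.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Element element = (Element) node;
            String name = element.getAttribute("name");
            String text = element.getTextContent().trim();
            switch (element.getTagName()) {
                case "int":
                    // Throws NumberFormatException on a non-numeric value
                    params.put(name, Integer.valueOf(text));
                    break;
                case "str":
                    params.put(name, text);
                    break;
                default:
                    throw new IllegalArgumentException(
                            "Unsupported parameter type: " + element.getTagName());
            }
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        String xml =
                "<parser class=\"org.apache.tika.parser.audio.AudioParser\">"
              + "<params>"
              + "<int name=\"someparam1\">2</int>"
              + "<str name=\"someOtherParam2\">something or other</str>"
              + "</params>"
              + "<mime>audio/basic</mime>"
              + "</parser>";
        // prints {someparam1=2, someOtherParam2=something or other}
        System.out.println(readParams(xml));
    }
}
```

Rejecting unknown element types and malformed integers up front mirrors the kind of cases the "bad-types" and "bad-values" test resources in the commit list appear to target, though what those files actually contain is not shown in this message.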



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)