tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1816) Lenient testing for NamedEntityParser
Date Mon, 11 Jan 2016 14:02:40 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091932#comment-15091932 ]

Tim Allison commented on TIKA-1816:
-----------------------------------

Yes, it works. Thank you!

{noformat}
Using the first Proxy setting : null@ something.or.other.org : XX
Proxy is configured
GET : http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin -> tika-parsers\src\test\resources\org\apache\tika\parser\ner\opennlp\ner-person.bin
(Using proxy? true)
10.2388212797% : 533233 bytes of 5207953
45.2644829936% : 2357353 bytes of 5207953
84.3405652854% : 4392417 bytes of 5207953
Copy complete.
Download Complete..
GET : http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin -> tika-parsers\src\test\resources\org\apache\tika\parser\ner\opennlp\ner-location.bin
(Using proxy? true)
40.0848188237% : 2048598 bytes of 5110658
65.4000717716% : 3342374 bytes of 5110658
67.4921702841% : 3449294 bytes of 5110658
Copy complete.
Download Complete..
GET : http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin -> tika-parsers\src\test\resources\org\apache\tika\parser\ner\opennlp\ner-organization.bin
(Using proxy? true)
39.3755384949% : 2085790 bytes of 5297172
75.4165052598% : 3994942 bytes of 5297172
Copy complete.
Download Complete..
GET : http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin -> tika-parsers\src\test\resources\org\apache\tika\parser\ner\opennlp\ner-date.bin
(Using proxy? true)
43.0595985494% : 2166030 bytes of 5030307
84.5145634253% : 4251342 bytes of 5030307
Copy complete.
Download Complete..
{noformat}
...snip...
{noformat}
Running org.apache.tika.parser.ner.NamedEntityParserTest
11 Jan 2016 09:01:15  INFO NamedEntityParser - going to load, instantiate and bind the instance of org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
11 Jan 2016 09:01:16  INFO OpenNLPNameFinder - LOCATION NER : Available for service ? true
11 Jan 2016 09:01:16  INFO OpenNLPNameFinder - ORGANIZATION NER : Available for service ? true
11 Jan 2016 09:01:17  INFO OpenNLPNameFinder - DATE NER : Available for service ? true
11 Jan 2016 09:01:17  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-money.bin using class loader
11 Jan 2016 09:01:17  INFO OpenNLPNameFinder - MONEY NER : Available for service ? false
11 Jan 2016 09:01:17  INFO OpenNLPNameFinder - PERSON NER : Available for service ? true
11 Jan 2016 09:01:17  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-percentage.bin using class loader
11 Jan 2016 09:01:17  INFO OpenNLPNameFinder - PERCENT NER : Available for service ? false
11 Jan 2016 09:01:17  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-time.bin using class loader
11 Jan 2016 09:01:17  INFO OpenNLPNameFinder - TIME NER : Available for service ? false
11 Jan 2016 09:01:17  INFO NamedEntityParser - org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser is available ? true
11 Jan 2016 09:01:17  INFO NamedEntityParser - going to load, instantiate and bind the instance of org.apache.tika.parser.ner.regex.RegexNERecogniser
11 Jan 2016 09:01:17  INFO NamedEntityParser - org.apache.tika.parser.ner.regex.RegexNERecogniser is available ? true
11 Jan 2016 09:01:17  INFO NamedEntityParser - Number of NERecognisers in chain 2
11 Jan 2016 09:01:17  INFO NamedEntityParser - going to load, instantiate and bind the instance of org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
11 Jan 2016 09:01:18  INFO OpenNLPNameFinder - LOCATION NER : Available for service ? true
11 Jan 2016 09:01:18  INFO OpenNLPNameFinder - ORGANIZATION NER : Available for service ? true
11 Jan 2016 09:01:19  INFO OpenNLPNameFinder - DATE NER : Available for service ? true
11 Jan 2016 09:01:19  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-money.bin using class loader
11 Jan 2016 09:01:19  INFO OpenNLPNameFinder - MONEY NER : Available for service ? false
11 Jan 2016 09:01:19  INFO OpenNLPNameFinder - PERSON NER : Available for service ? true
11 Jan 2016 09:01:19  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-percentage.bin using class loader
11 Jan 2016 09:01:19  INFO OpenNLPNameFinder - PERCENT NER : Available for service ? false
11 Jan 2016 09:01:19  WARN OpenNLPNameFinder - Couldn't find model from org/apache/tika/parser/ner/opennlp/ner-time.bin using class loader
11 Jan 2016 09:01:19  INFO OpenNLPNameFinder - TIME NER : Available for service ? false
11 Jan 2016 09:01:19  INFO NamedEntityParser - org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser is available ? true
11 Jan 2016 09:01:19  INFO NamedEntityParser - going to load, instantiate and bind the instance of org.apache.tika.parser.ner.regex.RegexNERecogniser
11 Jan 2016 09:01:19  INFO NamedEntityParser - org.apache.tika.parser.ner.regex.RegexNERecogniser is available ? true
11 Jan 2016 09:01:19  INFO NamedEntityParser - Number of NERecognisers in chain 2
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.373 sec - in org.apache.tika.parser.ner.NamedEntityParserTest
Running org.apache.tika.parser.ner.regex.RegexNERecogniserTest
11 Jan 2016 09:01:19  INFO NamedEntityParser - going to load, instantiate and bind the instance of org.apache.tika.parser.ner.regex.RegexNERecogniser
11 Jan 2016 09:01:19  INFO NamedEntityParser - org.apache.tika.parser.ner.regex.RegexNERecogniser is available ? true
11 Jan 2016 09:01:19  INFO NamedEntityParser - Number of NERecognisers in chain 1
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec - in org.apache.tika.parser.ner.regex.RegexNERecogniserTest
{noformat}
The tests then run and the build succeeds.  Thank you!

> Lenient testing for NamedEntityParser
> -------------------------------------
>
>                 Key: TIKA-1816
>                 URL: https://issues.apache.org/jira/browse/TIKA-1816
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Thamme Gowda N
>            Assignee: Chris A. Mattmann
>              Labels: memex
>             Fix For: 1.12
>
>         Attachments: TIKA-1816-proxy-fix.patch
>
>
> NamedEntityParser has a hard setup requirement: the NER models must be downloaded from remote servers and added to the classpath.
> These model files are huge and hence are not added to source control.
> So, the tests are likely to fail in various environments.
> Make a best effort to set up the tests, but in the worst case skip the tests instead of failing the whole build.
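
For readers of the archive, here is a minimal sketch of the "skip instead of fail" idea in the issue description. It is not the code from the attached patch. The class name `ModelAvailability` and its methods are hypothetical; the resource paths are the ones that appear in the log output above. The check looks each model up through the class loader and reports which ones are missing. In a JUnit 4 test, the result could be passed to `org.junit.Assume.assumeTrue(...)` so that the test is reported as skipped rather than failed.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: checks whether model files
// are visible on the classpath before a test tries to use them.
public class ModelAvailability {

    // True if the class loader can locate the given resource path.
    static boolean isOnClasspath(String resourcePath) {
        ClassLoader loader = ModelAvailability.class.getClassLoader();
        URL url = loader.getResource(resourcePath);
        return url != null;
    }

    // Returns the subset of paths that could not be found.
    static List<String> missing(List<String> resourcePaths) {
        List<String> notFound = new ArrayList<>();
        for (String path : resourcePaths) {
            if (!isOnClasspath(path)) {
                notFound.add(path);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Paths taken from the log output in this message.
        List<String> models = Arrays.asList(
            "org/apache/tika/parser/ner/opennlp/ner-person.bin",
            "org/apache/tika/parser/ner/opennlp/ner-location.bin",
            "org/apache/tika/parser/ner/opennlp/ner-money.bin");

        List<String> notFound = missing(models);
        if (notFound.isEmpty()) {
            System.out.println("All models found; tests can run.");
        } else {
            // In a JUnit 4 test, Assume.assumeTrue(notFound.isEmpty())
            // here would mark the test as skipped instead of failed.
            System.out.println("Skipping; missing models: " + notFound);
        }
    }
}
```

Doing the check at the start of the test (or in a `@Before` method) keeps the skip decision in one place, so a failed download does not break the whole build.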



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
