tika-dev mailing list archives

From Chris Mattmann <mattm...@apache.org>
Subject OSGI expert help from Bob/others: TIKA-2016
Date Wed, 03 May 2017 18:54:12 GMT
Hey Team,

I’m trying to get the TIKA-2016 sentiment analysis work integrated and am having a heck of 
a time fighting tika-bundle and OSGi, in which I am no expert.

See: https://github.com/apache/tika/pull/169/files

Basically, here is the situation:

1. The USC IRDS sentiment analysis parser has a bunch of Maven 
exclusions in the pom.xml updates to tika-parsers that Thamme made.
This compiles fine but fails at tika-bundle.
2. Usually my tika-bundle updates:
a. Include the jar artifactId ref
b. Add a ;resolution:=optional for the package includes
3. Doing #2 usually fixes it. In this case there are a ton of weird exclusions.
I tried to reflect them in the OSGi tika-bundle/pom.xml as best I could: I tried 
excluding Solr, handling the required tika-serialization inclusion, etc. 
I can get it to the point (if I add the sentiment-analysis-parser artifactId
back in) where it gets to the tests, but it fails the tests with:

Running org.apache.tika.bundle.BundleIT
[main] INFO org.ops4j.pax.exam.spi.DefaultExamSystem - Pax Exam System (Version: 4.10.0) created.
[main] INFO org.ops4j.pax.exam.junit.impl.ProbeRunner - creating PaxExam runner for class
[main] INFO org.ops4j.pax.exam.junit.impl.ProbeRunner - running test class org.apache.tika.bundle.BundleIT
INFO  running testBundleSimpleText in reactor
INFO  running testManifestNoJUnit in reactor
INFO  running testTesseractParser in reactor
INFO  running testTikaBundle in reactor
INFO  running testBundleDetection in reactor
INFO  running testBundleDetectors in reactor
INFO  running testBundleLoaded in reactor
INFO  running testForkParser in reactor
INFO  running testBundleParsers in reactor
[main] INFO org.ops4j.pax.exam.spi.reactors.ReactorManager - suite finished
Tests run: 9, Failures: 2, Errors: 5, Skipped: 0, Time elapsed: 7.155 sec <<< FAILURE!

Results :

Failed tests:   testBundleDetectors(org.apache.tika.bundle.BundleIT): Should have several Detector names, found 2
  testBundleParsers(org.apache.tika.bundle.BundleIT): Should have lots Parser names, found

Tests in error: 
  testBundleSimpleText(org.apache.tika.bundle.BundleIT): org.apache.tika.mime.MediaType not found by org.apache.tika.bundle [13]
  testTesseractParser(org.apache.tika.bundle.BundleIT): Could not initialize class org.apache.tika.parser.ocr.TesseractOCRParser
  testTikaBundle(org.apache.tika.bundle.BundleIT): Could not initialize class org.apache.tika.parser.pkg.PackageParser
  testBundleDetection(org.apache.tika.bundle.BundleIT): org.apache.tika.mime.MediaType not found by org.apache.tika.bundle [13]
  testForkParser(org.apache.tika.bundle.BundleIT): org.apache.tika.mime.MediaType not found by org.apache.tika.bundle [13]

Tests run: 9, Failures: 2, Errors: 5, Skipped: 0

I have no clue why. I’ve messed with including/excluding tika-core, etc. When I got
stuck on something similar before, I just had to step away from it, and the fix turned
out to be something I hadn’t thought of ☺

Any help from Bob, Nick or other OSGi gurus is appreciated.
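
For context, the usual approach in step 2 above amounts to something like the following in tika-bundle/pom.xml. This is only a rough sketch of the general maven-bundle-plugin pattern, not the actual change in the PR: the sentiment-analysis-parser artifactId is the one mentioned above, while the com.example.* package names are placeholders standing in for whatever packages the embedded jar really references.

```xml
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- 2a: reference the jar by artifactId so it is embedded in the bundle -->
      <Embed-Dependency>
        tika-parsers;inline=true,
        sentiment-analysis-parser
      </Embed-Dependency>
      <!-- 2b: mark packages the embedded jar refers to as optional imports,
           so the bundle still resolves when nothing exports them.
           The com.example.* names are placeholders, not real packages. -->
      <Import-Package>
        com.example.sentiment.*;resolution:=optional,
        com.example.thirdparty.*;resolution:=optional,
        *
      </Import-Package>
    </instructions>
  </configuration>
</plugin>
```

The catch-all `*` comes last so that the more specific optional entries are matched first.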

