tika-dev mailing list archives

From Keith Bennett <keithrbenn...@gmail.com>
Subject Questions
Date Mon, 09 Sep 2019 12:21:42 GMT
Hello, everyone. I am a Tika committer but have not been active for a long time. I've been
looking over the code and would appreciate it if you could answer some questions:

1) There is a Jira issue (at https://issues.apache.org/jira/browse/DRILL-6256?jql=text%20~%20%22readme%20java%207%22)
regarding the mention of Java 1.7 in the README (https://github.com/apache/tika/blob/master/README.md).
It was marked as fixed, but I still see Java 7 mentioned. Tika should work with the most recent
versions of Java, right? Should we not update the README accordingly? I noticed that there
is a "tika-java7" directory in the project consisting solely of a TikaFileTypeDetector class.
Can you help me understand what the connection with Java version 7 is? Is it that Tika code
should not use features that were absent in Java 7 (such as lambdas)?

2) I would like to bring "Rika" (https://github.com/ricn/rika), a Ruby wrapper around Tika,
up to date with respect to the dependency jar files packaged with it. I thought I would check
out the commit to which the 1.22 tag was attached, do a fresh Maven install, and use the
jar files that were installed ("~/.m2/repository/**/*jar"). Then again, Rika unconditionally loads
all the jar files; would it be faster to just use the jar file of the Tika distribution (e.g.
tika-app-1.22.jar) so that only one file instead of n needs to be loaded?

3) The description for the GitHub repo at https://github.com/apache/tika says "Tika Mirror".
Is it really a mirror, or has it become the authoritative source? (Given that I saw mentions
of pull requests, I suspect the latter.) If the latter, I suggest changing that text to something
like "Tika Authoritative Repository", as it is currently misleading.


Keith R. Bennett
