tika-dev mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Working on a new Translation plugin using Joshua
Date Tue, 17 Jun 2014 17:12:01 GMT
So, thanks again to Tyler for giving us our Translator interface.
Before I get too deep into building a TranslatingParser, a
TranslatingContentHandler, and the other goodies I have in the pipeline,
I'd like to get us another implementation of the Translator interface
that doesn't depend on an external service. After searching around for
permissively licensed "machine translation" APIs and reading some
papers ;), I found:

https://github.com/joshua-decoder/joshua


This seems to be pretty awesome: it's BSD licensed and written in Java, and
I am testing it right now on some sample data from DARPA XDATA and the
project there. The only rub is that it's not in the Maven Central
repository. I have reached out to @mjpost, one of the authors, and sent
info on how to get it published. I am going to work in my GitHub fork of
the project now to prepare the pom.xml and other stuff to get it ready for
publishing into Central.

In the meantime, I should have a Review Board patch up soon for
the JoshuaTranslator.

Cheers!

Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




