tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1319) Translation
Date Thu, 05 Jun 2014 23:05:02 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019388#comment-14019388 ]

Hudson commented on TIKA-1319:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.6 #21 (See [https://builds.apache.org/job/tika-trunk-jdk1.6/21/])
contribution for TIKA-1319: Translation module contributed by Tyler Palsulich. (mattmann:
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1600787)
* /tika/trunk/CHANGES.txt
* /tika/trunk/pom.xml
* /tika/trunk/tika-core/src/main/java/org/apache/tika/Tika.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/language/translate
* /tika/trunk/tika-core/src/main/java/org/apache/tika/language/translate/DefaultTranslator.java
* /tika/trunk/tika-core/src/main/java/org/apache/tika/language/translate/Translator.java
* /tika/trunk/tika-translate
* /tika/trunk/tika-translate/pom.xml
* /tika/trunk/tika-translate/src
* /tika/trunk/tika-translate/src/main
* /tika/trunk/tika-translate/src/main/java
* /tika/trunk/tika-translate/src/main/java/org
* /tika/trunk/tika-translate/src/main/java/org/apache
* /tika/trunk/tika-translate/src/main/java/org/apache/tika
* /tika/trunk/tika-translate/src/main/java/org/apache/tika/language
* /tika/trunk/tika-translate/src/main/java/org/apache/tika/language/translate
* /tika/trunk/tika-translate/src/main/java/org/apache/tika/language/translate/MicrosoftTranslator.java
* /tika/trunk/tika-translate/src/main/resources
* /tika/trunk/tika-translate/src/main/resources/META-INF
* /tika/trunk/tika-translate/src/main/resources/META-INF/services
* /tika/trunk/tika-translate/src/main/resources/META-INF/services/org.apache.tika.language.translate.Translator
* /tika/trunk/tika-translate/src/main/resources/org
* /tika/trunk/tika-translate/src/main/resources/org/apache
* /tika/trunk/tika-translate/src/main/resources/org/apache/tika
* /tika/trunk/tika-translate/src/main/resources/org/apache/tika/language
* /tika/trunk/tika-translate/src/main/resources/org/apache/tika/language/translator.microsoft.properties
* /tika/trunk/tika-translate/src/test
* /tika/trunk/tika-translate/src/test/java
* /tika/trunk/tika-translate/src/test/java/org
* /tika/trunk/tika-translate/src/test/java/org/apache
* /tika/trunk/tika-translate/src/test/java/org/apache/tika
* /tika/trunk/tika-translate/src/test/java/org/apache/tika/language
* /tika/trunk/tika-translate/src/test/java/org/apache/tika/language/translate
* /tika/trunk/tika-translate/src/test/java/org/apache/tika/language/translate/MicrosoftTranslatorTest.java


> Translation
> -----------
>
>                 Key: TIKA-1319
>                 URL: https://issues.apache.org/jira/browse/TIKA-1319
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tyler Palsulich
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.6
>
>
> I just opened up a review on reviews.apache.org -- https://reviews.apache.org/r/22219/.
> I copied the description below.
> This patch adds basic language translation functionality to Tika. Translation is provided
> by a Microsoft API, accessed through the Apache 2 licensed com.memetix.microsoft-translator-java-api
> (https://code.google.com/p/microsoft-translator-java-api/). To use the translation
> feature, a user has to add a client id and client secret to the
> tika-core/src/main/resources/org/apache/tika/language/translator.properties
> file (see http://msdn.microsoft.com/en-us/library/hh454950.aspx). I added com.memetix as
> a dependency in tika-core. I put the Translator class in org.apache.tika.language. There is
> no integration with the server or CLI yet. Further, only Strings are translated right now
> -- if you pass in a full document with XML tags, the structure will be mangled. But I think
> structure-aware translation would be a cool feature -- translate the body, title, subtitle,
> etc., but not the structural elements.
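
The structure-preserving idea mentioned above (translate the text, leave the tags alone) could look roughly like the sketch below. It is not part of the patch: the class name MarkupAwareTranslation and the method translateTextOnly are hypothetical, the translation step is passed in as a plain function, and a regular expression stands in for a real XML parser, so it only copes with simple, well-formed markup.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: translates the text between tags
// and copies the tags themselves through unchanged.
public class MarkupAwareTranslation {

    // Naive tag matcher; an assumption that holds only for simple markup
    // (no comments, CDATA sections, or '>' inside attribute values).
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static String translateTextOnly(String markup, UnaryOperator<String> translate) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(markup);
        int last = 0;
        while (m.find()) {
            // Text run before this tag gets translated, the tag does not.
            appendTranslated(out, markup.substring(last, m.start()), translate);
            out.append(m.group());
            last = m.end();
        }
        // Trailing text after the final tag.
        appendTranslated(out, markup.substring(last), translate);
        return out.toString();
    }

    private static void appendTranslated(StringBuilder out, String text,
                                         UnaryOperator<String> translate) {
        // Whitespace-only runs are kept verbatim so indentation survives.
        if (text.trim().isEmpty()) {
            out.append(text);
        } else {
            out.append(translate.apply(text));
        }
    }

    public static void main(String[] args) {
        // Toy translation function standing in for a real translator.
        UnaryOperator<String> toy = s -> s.equals("hello") ? "salut" : s;
        System.out.println(translateTextOnly("<title>hello</title><p>hello</p>", toy));
    }
}
```

A real implementation would more likely hook into a SAX or DOM parser so that entities and nested structure are handled properly; the regular expression is only there to keep the sketch short.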
> There is still more work to do, but I wanted some more eyes on this to make sure I'm
> heading in the right direction and that this is a desired feature. Let me know what you think!
> There are two simple unit tests for now, which translate "hello" to French ("salut"):
> one that takes both the source and target languages, and one that takes just the target
> language (and detects the source language automatically).
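
To illustrate the two call shapes those tests exercise, here is a minimal, self-contained sketch. It does not reproduce the Translator interface from the patch: SimpleTranslator and DictionaryTranslator are hypothetical names, and a small in-memory table stands in for the remote Microsoft service, so no client id or secret is needed to run it.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class TranslatorSketch {

    /** Hypothetical interface for illustration; not the interface from the patch. */
    interface SimpleTranslator {
        /** Translate with an explicit source and target language. */
        String translate(String text, String sourceLanguage, String targetLanguage);

        /** Translate to the target language, detecting the source automatically. */
        String translate(String text, String targetLanguage);
    }

    /** Toy stand-in for a remote translation service, backed by a lookup table. */
    static class DictionaryTranslator implements SimpleTranslator {
        private final Map<String, String> table = new HashMap<>();

        DictionaryTranslator() {
            // Keys have the form "source>target:text".
            table.put("en>fr:hello", "salut");
            table.put("fr>en:salut", "hello");
        }

        @Override
        public String translate(String text, String sourceLanguage, String targetLanguage) {
            String key = sourceLanguage + ">" + targetLanguage + ":"
                    + text.toLowerCase(Locale.ROOT);
            String hit = table.get(key);
            // Unknown input is returned unchanged rather than failing.
            return hit != null ? hit : text;
        }

        @Override
        public String translate(String text, String targetLanguage) {
            return translate(text, detect(text), targetLanguage);
        }

        // Crude "detection": find any table entry for this text and use its source.
        private String detect(String text) {
            String suffix = ":" + text.toLowerCase(Locale.ROOT);
            for (String key : table.keySet()) {
                if (key.endsWith(suffix)) {
                    return key.substring(0, key.indexOf('>'));
                }
            }
            return "en"; // assumed default when nothing matches
        }
    }

    public static void main(String[] args) {
        SimpleTranslator translator = new DictionaryTranslator();
        System.out.println(translator.translate("hello", "en", "fr")); // prints "salut"
        System.out.println(translator.translate("hello", "fr"));       // prints "salut"
    }
}
```

Keeping the two overloads on one interface means a caller who already knows the source language can skip detection, while the shorter form stays available for the common case.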



--
This message was sent by Atlassian JIRA
(v6.2#6252)
