lucene-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8714) Implement translation contrib package for LanguageTranslationUpdateProcessor's
Date Sat, 26 Mar 2016 08:36:25 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212885#comment-15212885 ]

Lewis John McGibbney commented on SOLR-8714:
--------------------------------------------

Hi [~teofili], I started a patch which I thought was sound. The blocker right now is SOLR-8716.
If we can do the upgrade on Tika, then this issue (with Joshua, for example, backing statistical
machine translation via the language packs we've been generating) is IMHO a game changer for
the way that Web crawlers harvest data and make it available, useful and ultimately meaningful
to us all. Of course others are already doing statistical machine translation at indexing time,
but getting it into the open source Apache Solr would be excellent.

> Implement translation contrib package for LanguageTranslationUpdateProcessor's
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-8714
>                 URL: https://issues.apache.org/jira/browse/SOLR-8714
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Lewis John McGibbney
>             Fix For: master
>
>
> A while back over in Tika we implemented the [Translator|https://github.com/apache/tika/blob/master/tika-core/src/main/java/org/apache/tika/language/translate/Translator.java]
> interface. There are now a number of [implementations|https://github.com/apache/tika/tree/master/tika-translate/src/main/java/org/apache/tika/language/translate].
>
> This issue will provide a translation contrib package offering a LanguageTranslationUpdateProcessor.
> The new processor will probably utilize the existing [Solr Language Identifier|https://github.com/apache/lucene-solr/tree/master/solr/contrib/langid]
> and would enable a document to be translated based upon a user-defined mapping. LanguageTranslationUpdateProcessors
> should be pluggable and would be placed in an UpdateChain in the same way as the [LanguageIdentifierUpdateProcessor|https://github.com/apache/lucene-solr/blob/master/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java].
>
> It is my intent to also provide a wiki page which can be referenced and maintained in
> conjunction with the code.
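
To make the idea in the description concrete, here is a minimal sketch of a translation step driven by a user-defined language mapping. It is not the patch mentioned in the comment and it uses neither the Solr nor the Tika APIs: the Translator interface, the TranslatingProcessor class, the field names ("text", "language") and the "_<lang>" suffix for the output field are all hypothetical, and a plain Map stands in for a Solr document. It assumes an earlier step (such as language identification) has already written the detected language code into the document.

```java
import java.util.HashMap;
import java.util.Map;

public class TranslationSketch {

    /** Hypothetical translation backend; a stand-in, not Tika's actual interface. */
    @FunctionalInterface
    interface Translator {
        String translate(String text, String sourceLang, String targetLang);
    }

    /** Hypothetical processor: translates one field according to a source-to-target language mapping. */
    static final class TranslatingProcessor {
        private final Translator translator;
        private final Map<String, String> langMap; // detected language -> target language
        private final String sourceField;
        private final String langField;

        TranslatingProcessor(Translator translator, Map<String, String> langMap,
                             String sourceField, String langField) {
            this.translator = translator;
            this.langMap = langMap;
            this.sourceField = sourceField;
            this.langField = langField;
        }

        /** Adds "<sourceField>_<target>" when the document's language has a mapping; otherwise leaves it untouched. */
        void process(Map<String, Object> doc) {
            Object lang = doc.get(langField);
            Object text = doc.get(sourceField);
            if (!(lang instanceof String) || !(text instanceof String)) {
                return; // nothing to translate, or no detected language
            }
            String target = langMap.get(lang);
            if (target == null) {
                return; // no mapping configured for this language
            }
            String translated = translator.translate((String) text, (String) lang, target);
            doc.put(sourceField + "_" + target, translated);
        }
    }

    public static void main(String[] args) {
        // Fake backend that only tags the text; a real one would call a translation service.
        Translator fake = (text, src, tgt) -> "[" + src + "->" + tgt + "] " + text;

        Map<String, String> mapping = new HashMap<>();
        mapping.put("es", "en"); // user-defined mapping: Spanish documents are translated to English

        TranslatingProcessor processor = new TranslatingProcessor(fake, mapping, "text", "language");

        Map<String, Object> doc = new HashMap<>();
        doc.put("language", "es");
        doc.put("text", "hola mundo");
        processor.process(doc);

        System.out.println(doc.get("text_en"));
    }
}
```

Keeping the backend behind a small interface is what would make such a processor pluggable: a different translator can be swapped in without touching the mapping logic.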



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


