lucene-dev mailing list archives

From "Grant Ingersoll (JIRA)" <>
Subject [jira] Commented: (SOLR-2244) Add Language Identification support
Date Fri, 03 Dec 2010 21:46:13 GMT


Grant Ingersoll commented on SOLR-2244:

I'm going to move forward with this patch, since I don't see one for SOLR-1979.  

I'm going to keep it in contrib/langid, but have it use the Tika libs from contrib/extraction
so that we won't have to package them twice.  I don't really like renaming contrib/extraction
to contrib/tika, since the functionality would then be unclear and we may also have other
language identification tools in the future.

> Add Language Identification support
> -----------------------------------
>                 Key: SOLR-2244
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>         Attachments: solr2244.patch
> For starters, Tika has language identification capabilities that we can likely leverage,
> but moreover, make it easier for people to plug in language identification into the indexing
