tika-dev mailing list archives

From "Christian Moen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-856) Support CJK (Chinese, Japanese and Korean) language detection
Date Sun, 19 Feb 2012 17:46:36 GMT

    [ https://issues.apache.org/jira/browse/TIKA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211440#comment-13211440 ]

Christian Moen commented on TIKA-856:

Thanks, Jan R.  The {{language-detection}} library is similar to Tika's, and the command
line mentioned in your link and the one Jan H. mentions above basically do the same thing.

Jan H., I'll see if I can put together some language profiles for CJK for Tika later this

> Support CJK (Chinese, Japanese and Korean) language detection
> -------------------------------------------------------------
>                 Key: TIKA-856
>                 URL: https://issues.apache.org/jira/browse/TIKA-856
>             Project: Tika
>          Issue Type: New Feature
>          Components: languageidentifier
>    Affects Versions: 1.0
>         Environment: All
>            Reporter: James Sullivan
>              Labels: Chinese, Japanese
> Support language detection of CJK (Chinese, Japanese and Korean).
> Some estimates have Chinese users overtaking English users on the Internet, so it is
> important that these languages, used by a large number of people, be supported.
> See TIKA-855


