lucene-dev mailing list archives

From "Shinichiro Abe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-6007) Add param "archive.encoding" for ExtractingRequestHandler
Date Wed, 23 Apr 2014 09:10:16 GMT

     [ https://issues.apache.org/jira/browse/SOLR-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichiro Abe updated SOLR-6007:
---------------------------------

    Attachment: japanese-sjis.zip

The unit test using Tika 1.6-dev (trunk) passed.

> Add param "archive.encoding" for ExtractingRequestHandler
> ---------------------------------------------------------
>
>                 Key: SOLR-6007
>                 URL: https://issues.apache.org/jira/browse/SOLR-6007
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Solr Cell (Tika extraction)
>            Reporter: Shinichiro Abe
>            Priority: Minor
>         Attachments: SOLR-6007.patch, japanese-sjis.zip
>
>
> When extracting from zip files that were created on Japanese-language Windows,
> the file names extracted from the zip are garbled (these file names were written
> in a CJK language). TIKA-936 allows us to set a custom encoding (e.g. SJIS), so
> I can get file names that are not garbled. It would be nice if an archive
> encoding parameter could be specified in Solr Cell.
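
For readers unfamiliar with the problem, here is a minimal Python sketch of why such file names come out garbled. It is only an illustration, not the code in SOLR-6007.patch or in Tika. It assumes the reader falls back to CP437 (the legacy zip default, and what Python's zipfile module uses when the UTF-8 flag is not set) while the archive actually stored the names as Shift_JIS. The helper name recover_zip_name and the sample file name are hypothetical.

```python
def recover_zip_name(garbled: str, archive_encoding: str = "shift_jis") -> str:
    """Re-decode a zip entry name that was wrongly decoded as CP437.

    Hypothetical helper: it turns the garbled string back into the raw
    bytes stored in the archive, then decodes them with the encoding
    the archive was really written in.
    """
    return garbled.encode("cp437").decode(archive_encoding)


# A file name as typed on Japanese-language Windows (sample value).
original = "日本語.txt"

# What a legacy zip tool stores: Shift_JIS bytes, with no UTF-8 flag.
raw_name_bytes = original.encode("shift_jis")

# What a reader sees when it assumes CP437: a garbled name.
garbled = raw_name_bytes.decode("cp437")

# What the reader gets once it is told the archive encoding.
recovered = recover_zip_name(garbled, "shift_jis")

print(repr(garbled))
print(recovered)
```

An archive encoding parameter avoids the garbling in the first place, because the reader decodes the stored bytes with the right charset instead of guessing.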



--
This message was sent by Atlassian JIRA
(v6.2#6252)


