tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (TIKA-125) Pass Locale information to parsers
Date Mon, 14 Dec 2009 23:54:18 GMT

     [ https://issues.apache.org/jira/browse/TIKA-125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-125.

       Resolution: Fixed
    Fix Version/s: 0.6
         Assignee: Jukka Zitting

The parse context mechanism is perfect for this need. Use the following code to specify the
default locale to be used when formatting data from documents like Excel sheets that don't
contain explicit locale information:

    Parser parser = ...;
    ParseContext context = new ParseContext();
    context.set(Locale.class, myLocale);
    parser.parse(..., context);
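
To illustrate why the supplied locale matters, here is a minimal, self-contained sketch of a
type-keyed context lookup with a fallback to the JVM default. It does not use Tika itself:
SketchContext and renderCell are hypothetical names invented for this example, and the
stand-in class only mimics the set(Class, value) call shown above. How the real parsers
consume the Locale is not covered by this message.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LocaleContextSketch {

    /** Hypothetical stand-in for a type-keyed parse context; NOT Tika's ParseContext. */
    public static final class SketchContext {
        private final Map<Class<?>, Object> values = new HashMap<>();

        /** Stores a value under its type, mirroring context.set(Locale.class, myLocale). */
        public <T> void set(Class<T> key, T value) {
            values.put(key, value);
        }

        /** Returns the value stored for the type, or the fallback if none was set. */
        public <T> T get(Class<T> key, T fallback) {
            Object value = values.get(key);
            return value != null ? key.cast(value) : fallback;
        }
    }

    /**
     * Formats a numeric cell the way a locale-aware parser might (an assumption):
     * use the client-supplied Locale if present, otherwise the JVM default.
     */
    public static String renderCell(double cell, SketchContext context) {
        Locale locale = context.get(Locale.class, Locale.getDefault());
        return NumberFormat.getNumberInstance(locale).format(cell);
    }

    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        context.set(Locale.class, Locale.GERMANY);
        // Rendered with German separators regardless of the JVM default locale.
        System.out.println(renderCell(1234.5, context));
    }
}
```

Because the locale travels with the individual parse call, two clients in the same JVM can
extract text with different locales without touching the global default.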

> Pass Locale information to parsers
> ----------------------------------
>                 Key: TIKA-125
>                 URL: https://issues.apache.org/jira/browse/TIKA-125
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 0.6
>
> Looking at TIKA-103 I realized that some file formats can contain data whose text rendering
> depends on the active Locale, which might not be explicitly specified in the file format or
> the specific document being parsed.
> It should be possible for a parser client to explicitly specify which Locale should be
> used as the default when extracting text from a document. Setting the global default with
> Locale.setDefault() is not an option in many cases.
> I think the best way to pass Locale information to a parser is as a part of the Metadata

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
