tika-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: A problem in the right-to-left languages
Date Tue, 01 Nov 2011 13:14:14 GMT

On Tue, Nov 1, 2011 at 1:48 PM, Robert Muir <rcmuir@gmail.com> wrote:
> I really think Tika should include the parts of icu4j it depends on.
> Often open source projects are hesitant to include the icu jar because of
> its size, but that's silly since the size is just a catch-all.
> We can use the webapp to make a smaller one that includes the minimum
> of stuff Tika needs. http://apps.icu-project.org/datacustom/

We need a version that's available on the central Maven repository.

> Maybe we should open a JIRA issue to fix this? I think it's a bug that
> Arabic and Persian text silently comes out corrupted if you don't have
> this in your classpath.



Jukka Zitting
