tika-dev mailing list archives

From Julien Nioche <lists.digitalpeb...@gmail.com>
Subject Pluggable language detection
Date Wed, 21 Mar 2012 15:51:54 GMT
Hi guys,

Just wondering about the best way to make the language detection pluggable
instead of having it hard-wired as it is now. We know that the resources
that are currently in Tika are both slow and inaccurate [1], and there are
other libraries that we could leverage. Why not have the option to select
a different implementation, just like we do for parsers? Obviously we'd need
a common interface for the language detectors, etc.
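
To make the idea a bit more concrete, here is a rough sketch of what such a
common interface and a selection mechanism might look like. Everything in it
is hypothetical: LanguageDetector, StopwordDetector and LanguageDetectors are
names made up for illustration and are not existing Tika classes, and the
stopword counting is only a toy stand-in for a real detection library. It
needs Java 9 or later because of Map.of and Set.of.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical common interface; not an existing Tika API. */
interface LanguageDetector {
    /** Short name used to select this implementation. */
    String name();

    /** Returns a language code such as "en", or "unknown". */
    String detect(String text);
}

/** Toy implementation that counts a handful of stopwords (illustration only). */
final class StopwordDetector implements LanguageDetector {
    private static final Map<String, Set<String>> STOPWORDS = Map.of(
            "en", Set.of("the", "and", "is", "of"),
            "fr", Set.of("le", "la", "et", "est"));

    @Override
    public String name() {
        return "stopword";
    }

    @Override
    public String detect(String text) {
        // Split on anything that is not a letter.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            // Strictly greater: a text with no stopword hits stays "unknown".
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }
}

/** Hypothetical registry; callers pick an implementation by name. */
final class LanguageDetectors {
    private static final Map<String, LanguageDetector> REGISTRY = new LinkedHashMap<>();

    private LanguageDetectors() {
    }

    static void register(LanguageDetector detector) {
        REGISTRY.put(detector.name(), detector);
    }

    static LanguageDetector byName(String name) {
        LanguageDetector detector = REGISTRY.get(name);
        if (detector == null) {
            throw new IllegalArgumentException("No detector named " + name);
        }
        return detector;
    }
}

class PluggableDetectionDemo {
    public static void main(String[] args) {
        LanguageDetectors.register(new StopwordDetector());
        LanguageDetector detector = LanguageDetectors.byName("stopword");
        System.out.println(detector.detect("the cat is on the table"));
    }
}
```

The explicit register() call is only there to keep the sketch self-contained.
Implementations could presumably be discovered from the classpath instead,
for example with java.util.ServiceLoader, so that another library could be
swapped in without touching the calling code.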

What do you think?



Open Source Solutions for Text Engineering

