nutch-dev mailing list archives

From Julien Nioche <lists.digitalpeb...@gmail.com>
Subject Re: Integration with Tika
Date Thu, 12 Nov 2009 14:41:11 GMT
Speaking of which, I'm planning to do some work on the Tika integration
within the next week or so. Basically, I'll create a new plugin which will
be used for the mime types that Tika can already handle while keeping some
of the existing plugins for the more complex cases. This should allow us to
already have a first version of the Tika integration without losing any of
the existing functionality. I will update the list as soon as I have
something working and will create a JIRA issue.
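
To illustrate the split described above, here is a minimal sketch of routing by MIME type: existing plugins keep the complex cases, and a Tika-backed plugin takes the types Tika can already handle. Every name in it (select_parser, tika_parser, legacy_html_parser) and the specific MIME type assignments are hypothetical, chosen only for illustration. It does not use the real Nutch plugin API or the Tika API.

```python
# Hypothetical sketch only: none of these names come from Nutch or Tika.

def legacy_html_parser(content: bytes) -> str:
    """Stand-in for an existing plugin kept for a more complex case."""
    return "parsed by existing plugin"


def tika_parser(content: bytes) -> str:
    """Stand-in for the new plugin that would delegate to Tika."""
    return "parsed by Tika-backed plugin"


# Assumption: MIME types that stay with an existing plugin.
LEGACY_PARSERS = {"text/html": legacy_html_parser}

# Assumption: MIME types handed over to the new Tika-backed plugin.
TIKA_MIME_TYPES = {"application/pdf", "application/msword"}


def select_parser(mime_type: str):
    """Pick a parser: existing plugins first, then the Tika-backed one."""
    if mime_type in LEGACY_PARSERS:
        return LEGACY_PARSERS[mime_type]
    if mime_type in TIKA_MIME_TYPES:
        return tika_parser
    raise LookupError(f"no parser registered for {mime_type}")
```

Checking the existing plugins first means a type handled by both stays with the existing plugin, so nothing is lost while the Tika-backed plugin is phased in.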

J.
-- 
DigitalPebble Ltd
http://www.digitalpebble.com

2009/11/10 Andrzej Bialecki <ab@getopt.org>

> BrunoWL wrote:
>
>> Hi. I'm a beginner in Nutch. Can anybody tell me how to make Nutch use
>> parsers from Tika?
>> I did all kinds of searches and didn't find an answer.
>>
>
> Tika parsers are not integrated yet with Nutch - we use our own parsers,
> and in most cases they are of similar quality to those in Tika (since most
> Tika parsers originated in Nutch). Tight Tika integration is on the roadmap.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
