tika-dev mailing list archives

From "robert burrell donkin" <robertburrelldon...@gmail.com>
Subject Re: Support for document libraries
Date Tue, 10 Jul 2007 12:51:11 GMT
On 7/10/07, Carsten Ziegeler <cziegeler@apache.org> wrote:
> Bertrand Delacretaz wrote:
> > On 7/10/07, Carsten Ziegeler <cziegeler@apache.org> wrote:
> >
> >> ... Although Tika is more the framework for plugging in such stuff,
> >> it perhaps makes sense to try to start something like that as sub
> >> projects of Tika?...
> >
> > I would agree, although IMHO Tika should reuse existing libraries as
> > much as possible.
> >
> Yes, it doesn't make sense to reinvent the wheel if there are
> good-enough libraries out there. But afaik for several formats there
> aren't suitable libs available, so these are the cases where I think
> that it makes sense to "drag them in".

IMHO it makes sense to start them in tika, but commons might be a good
long-term home for at least some of them. if these really are
libraries then it would be best to isolate them from the start and
then add adapter code to tika.

for example, there is talk of a couple of possible options for
MIME-type discovery. perhaps it would make sense to factor both
options as libraries and just have the adapters in tika.
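
a rough sketch of what that split could look like, in java. every name
here (ExtensionGuesser, MagicSniffer, MimeTypeDetector, the adapters) is
made up for illustration and is not an existing tika or commons API. the
two "libraries" know nothing about tika; only the thin adapters implement
the tika-side interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class MimeAdapterSketch {

    // Hypothetical stand-alone library #1: guesses a type from the file name.
    static final class ExtensionGuesser {
        String guessFromName(String fileName) {
            String lower = fileName.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".pdf")) return "application/pdf";
            if (lower.endsWith(".txt")) return "text/plain";
            return null; // unknown extension
        }
    }

    // Hypothetical stand-alone library #2: checks leading "magic" bytes.
    static final class MagicSniffer {
        Optional<String> sniff(byte[] header) {
            byte[] pdfMagic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
            if (header.length < pdfMagic.length) return Optional.empty();
            for (int i = 0; i < pdfMagic.length; i++) {
                if (header[i] != pdfMagic[i]) return Optional.empty();
            }
            return Optional.of("application/pdf");
        }
    }

    // Hypothetical framework-side interface; the only part tika would own.
    interface MimeTypeDetector {
        Optional<String> detect(String fileName, byte[] header);
    }

    // Adapter: wraps library #1 behind the framework interface.
    static final class ExtensionAdapter implements MimeTypeDetector {
        private final ExtensionGuesser guesser = new ExtensionGuesser();

        @Override
        public Optional<String> detect(String fileName, byte[] header) {
            return Optional.ofNullable(guesser.guessFromName(fileName));
        }
    }

    // Adapter: wraps library #2 behind the same interface.
    static final class MagicAdapter implements MimeTypeDetector {
        private final MagicSniffer sniffer = new MagicSniffer();

        @Override
        public Optional<String> detect(String fileName, byte[] header) {
            return sniffer.sniff(header);
        }
    }

    // Tries each detector in order; falls back to a generic binary type.
    static String detectFirst(List<MimeTypeDetector> detectors,
                              String fileName, byte[] header) {
        for (MimeTypeDetector detector : detectors) {
            Optional<String> type = detector.detect(fileName, header);
            if (type.isPresent()) return type.get();
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        List<MimeTypeDetector> detectors =
                List.of(new MagicAdapter(), new ExtensionAdapter());
        byte[] header = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detectFirst(detectors, "report.bin", header));
        System.out.println(detectFirst(detectors, "notes.txt", new byte[0]));
    }
}
```

with this shape either library could move elsewhere later and only the
corresponding adapter would need to change.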

- robert
