tika-dev mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: [DISCUSS] Integrate Apache Any23 into Apache Tika
Date Fri, 18 Oct 2013 14:46:25 GMT
Hi Lewis,

I haven't had much time to look into Any23, which includes reviewing Markus's patch for integrating
some portions of that into Tika (see https://issues.apache.org/jira/browse/TIKA-980).

The main challenge I see is that Tika seems to do best as a wrapper for other parsers, versus
outright ownership of parsers.

Which isn't to say that rolling Any23 into Tika wouldn't work, but without at least one active
developer it seems likely that the code would languish.

But maybe that's OK…

-- Ken

On Oct 18, 2013, at 7:30am, Lewis John Mcgibbney wrote:

> Hi Tika Devs/PMC,
> 
> This thread is aimed at recognizing common ground shared by Any23 and Tika
> in an attempt to possibly integrate Any23 into Tika.
> First however it will serve a purpose for me to put this into context and
> also provide some rationale behind this initiative.
> 
> It is my understanding that the Tika PMC sponsored Any23 through the Apache
> Incubator until we (the Any23 PMC) were ready to graduate having made an
> incubating release and having grown the community somewhat. Post
> graduation, we made a 0.8.0 release in July 2013.
> 
> It is also my understanding that the logical justification for the Tika PMC
> sponsoring us was that numerous devs envisaged common ground between the
> aims and objectives of both projects, e.g. mime type detection, parsing,
> extraction of metadata, serialization, etc. With a little positive thinking
> and understanding of both projects, one can clearly see the shared interests.
> 
> I am speaking on behalf of the Any23 community here when I say that we have,
> however, come to the realization that the community is not as vibrant as we
> would like. This is combined with the fact that the original project
> devs are not around right now to keep the project moving forward.
> 
> It is therefore of interest to us, to approach the Tika community with the
> intention of discussing a proposal to integrate Any23 code into Apache Tika.
> 
> For those interested, the Any23 project URL is http://any23.apache.org, we
> also have a live service which you can use to get a feel for what Any23
> actually does. It can be found at http://any23.org.
> 
> Any feedback from this community would be really appreciated, as it looks
> like the alternative would be for us to take the code into the Apache
> Attic... which is always a last resort.
> 
> Thanks in advance.
> 
> Lewis
> 
> -- 
> *Lewis*

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
