tika-dev mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
Date Thu, 07 Aug 2014 21:54:05 GMT
Hey Nick! :)

I'd have no problem pinching the code from Tika in Action. I wonder if
the Manning folks would mind.

I'll reach out to them.

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Nick Burch <apache@gagravarr.org>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Thursday, August 7, 2014 2:42 PM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage

>On Thu, 7 Aug 2014, Tyler Palsulich wrote:
>> Sounds like the new module is a good idea. So, let's jump on it! I will
>> create a new 'example' JIRA tag and create issues for creating the
>> module and adding Parse, Detect, and Translate examples. Others should
>> add issues/desired examples as they see fit. How's that sound?
>
>I wonder if it's worth approaching those crazy fools who wrote a book on
>Tika, to see if we could pinch one or two of their examples? If only we
>knew who they were... ;-)
>
>
>Recursion is one that causes confusion, we've got some example programs
>on the wiki that we can include:
>https://wiki.apache.org/tika/RecursiveMetadata
>
>Ray Gauss is probably our best bet for advanced metadata stuff to send in
>some examples on that!
>
>Another one that has generated mailing list traffic lately is embedded
>images, including re-writing links to them. There's some (LGPL) code in
>Alfresco which I wrote a few years ago to do that, Ray might be able to
>get the nod to contribute that (or a cut-down version) as an example of
>that style of parsing HTML + embedded resources in parallel.
>
>Nick

