tika-dev mailing list archives

From "Rida Benjelloun" <rida.benjell...@doculibre.com>
Subject Re: Tika use cases
Date Fri, 24 Aug 2007 20:39:19 GMT
Hi Jukka,
I agree with your use cases.
+1.
Regards.

On 8/17/07, Jukka Zitting <jukka.zitting@gmail.com> wrote:
>
> Hi,
>
> I was thinking about ways to best model the Tika interfaces, and it
> seems to me that the only sane way to do that is to start with use
> cases and how a client would most naturally use a toolkit like Tika.
> Here are some of my initial ideas for review, feel free to add more
> cases or suggest alternatives.
>
> 1) Extract structured text content from a stream (default configuration):
>
>     InputStream stream = ...;
>     ContentHandler handler = ...; // SAX event handler
>     new SomeTikaClass().parse(stream, handler);
>
> 2) Set configuration options:
>
>     SomeTikaClass tika = new SomeTikaClass();
>     tika.setConfigurationOption1(...);
>     tika.setConfigurationOption2(...);
>     // also composition, etc.
>
> 3) Extract metadata from a stream:
>
>     InputStream stream = ...;
>     Metadata metadata = new Metadata(); // Metadata container
>     new SomeTikaClass().parse(stream, metadata);
>
> 4) Provide external metadata as input for parsing:
>
>     InputStream stream = ...;
>     ContentHandler handler = ...;
>     Metadata metadata = new Metadata();
>     metadata.setFileName(...);
>     new SomeTikaClass().parse(stream, handler, metadata);
>
> BR,
>
> Jukka Zitting
>
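
For illustration only, the use cases quoted above could be exercised with a sketch like the following. Everything here is an assumption, not an actual Tika API: SomeTikaClass is the placeholder name from the proposal, Metadata is a hypothetical map-backed container, the "contentLength" key and the TextCollector handler are invented for the example, and the "parser" simply treats any stream as UTF-8 text wrapped in a single <p> element. Only the SAX types (ContentHandler, DefaultHandler, AttributesImpl) are real JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class TikaUseCaseSketch {

    /** Hypothetical metadata container (use cases 3 and 4). */
    static class Metadata {
        private final Map<String, String> values = new HashMap<>();
        void setFileName(String name) { values.put("fileName", name); }
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Hypothetical stand-in for SomeTikaClass; reads every stream as UTF-8 text. */
    static class SomeTikaClass {
        // Use case 4: emit SAX events and fill in metadata; the caller may
        // pre-populate the metadata (for example with a file name).
        void parse(InputStream stream, ContentHandler handler, Metadata metadata)
                throws IOException, SAXException {
            byte[] bytes = stream.readAllBytes();
            metadata.set("contentLength", Integer.toString(bytes.length));
            char[] text = new String(bytes, StandardCharsets.UTF_8).toCharArray();
            handler.startDocument();
            handler.startElement("", "p", "p", new AttributesImpl());
            handler.characters(text, 0, text.length);
            handler.endElement("", "p", "p");
            handler.endDocument();
        }

        // Use case 1: structured text only; extracted metadata is discarded.
        void parse(InputStream stream, ContentHandler handler)
                throws IOException, SAXException {
            parse(stream, handler, new Metadata());
        }

        // Use case 3: metadata only; SAX events go to a no-op handler.
        void parse(InputStream stream, Metadata metadata)
                throws IOException, SAXException {
            parse(stream, new DefaultHandler(), metadata);
        }
    }

    /** SAX handler that collects all character data into a string. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    static InputStream streamOf(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // Use case 1: extract text with the default configuration.
        TextCollector handler = new TextCollector();
        new SomeTikaClass().parse(streamOf("hello tika"), handler);
        System.out.println("text: " + handler.text);

        // Use case 4: supply external metadata, then read back what was added.
        Metadata metadata = new Metadata();
        metadata.setFileName("example.txt");
        new SomeTikaClass().parse(streamOf("hello tika"), new TextCollector(), metadata);
        System.out.println("fileName: " + metadata.get("fileName"));
        System.out.println("contentLength: " + metadata.get("contentLength"));
    }
}
```

In this sketch the two shorter overloads delegate to the three-argument form, which is one way the four use cases could share a single code path. Configuration setters (use case 2) are left out because the proposal does not name any concrete options.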



-- 
---------------------------------------------------------
Rida Benjelloun
Doculibre inc.
ridabenjelloun@apache.org
rida.benjelloun@doculibre.com
Cel: 418-262-3222
Tel: 418-353-3390
Site Web : http://www.doculibre.com
---------------------------------------------------------
