tika-dev mailing list archives

From Robert Burrell Donkin <robertburrelldon...@gmail.com>
Subject Mime Detection
Date Thu, 21 May 2009 17:48:17 GMT
the documentation could do with an explanation of mime typing best
practice. i'll create a patch once i'm sure i understand it...

please jump in with corrections

- robert


A. from the basic user perspective, the quick start way to mime type is to

1. Use MimeTypesFactory#createMimeTypes() to create a MimeTypes with
the default tika configuration
2. if you want just name-based heuristics, call getMimeType passing a
file, url or name
3. if you want full typing heuristics including magic, call getMimeType
passing an input stream
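
to make the difference between 2 and 3 concrete, here is a rough, self-contained sketch. it does not use tika at all: the class MimeSketch, its methods and its tiny type tables are made up for illustration (assumptions, not the real api). it only shows why a name-based guess and a magic-based guess can disagree for the same resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only: a hypothetical stand-in, NOT Tika's MimeTypes API.
class MimeSketch {
    static final String UNKNOWN = "application/octet-stream";

    // Hypothetical extension table; a real configuration is far larger.
    static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("png", "image/png");
        BY_EXTENSION.put("txt", "text/plain");
    }

    // Name-based heuristic: looks only at the file extension.
    static String typeFromName(String name) {
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return UNKNOWN;
        }
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, UNKNOWN);
    }

    // Magic-based heuristic on bytes already read from the start of the data.
    static String typeFromMagic(byte[] head) {
        if (startsWith(head, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(head, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) {
            return "image/png";
        }
        return UNKNOWN;
    }

    // Peeks at up to 8 leading bytes, then resets so the caller can re-read.
    static String typeFromStream(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        in.reset();
        return typeFromMagic(Arrays.copyOf(head, n));
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // PDF content hiding behind a misleading .txt name.
        byte[] content = "%PDF-1.4\n...".getBytes(StandardCharsets.US_ASCII);
        System.out.println(typeFromName("notes.txt"));
        System.out.println(typeFromStream(new ByteArrayInputStream(content)));
    }
}
```

the name-only call trusts the extension, while the stream call looks at the leading bytes, so a mislabelled file is typed differently by the two.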

B. from an advanced user perspective, the heuristics can be customised by

1. passing a different configuration file to
2 & 3 as above

C. developers of new detectors should take a look at the detector
interface and then customise as above
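
as a rough idea of what writing a new detector might look like, here is a self-contained sketch. the SketchDetector interface below is hypothetical and defined here only for illustration; tika's own detector interface may well have a different signature, so check it before copying anything. GifMagicDetector, ExtensionDetector and DetectorChain are likewise made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical interface for illustration; NOT Tika's detector interface.
interface SketchDetector {
    // Returns a media type name, or null when this detector has no opinion.
    String detect(InputStream in, String resourceName) throws IOException;
}

// A new detector: recognises GIF data by its leading "GIF8" signature.
class GifMagicDetector implements SketchDetector {
    private static final byte[] MAGIC = "GIF8".getBytes(StandardCharsets.US_ASCII);

    @Override
    public String detect(InputStream in, String resourceName) throws IOException {
        if (in == null || !in.markSupported()) {
            return null;
        }
        in.mark(MAGIC.length);
        byte[] head = new byte[MAGIC.length];
        int n = 0;
        try {
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } finally {
            in.reset(); // leave the stream where we found it
        }
        return (n == MAGIC.length && Arrays.equals(head, MAGIC)) ? "image/gif" : null;
    }
}

// Name-based fallback that looks only at the extension.
class ExtensionDetector implements SketchDetector {
    @Override
    public String detect(InputStream in, String resourceName) {
        if (resourceName != null
                && resourceName.toLowerCase(Locale.ROOT).endsWith(".gif")) {
            return "image/gif";
        }
        return null;
    }
}

// Tries each detector in order and returns the first non-null answer.
class DetectorChain implements SketchDetector {
    private final List<SketchDetector> detectors;

    DetectorChain(SketchDetector... detectors) {
        this.detectors = Arrays.asList(detectors);
    }

    @Override
    public String detect(InputStream in, String resourceName) throws IOException {
        for (SketchDetector d : detectors) {
            String type = d.detect(in, resourceName);
            if (type != null) {
                return type;
            }
        }
        return "application/octet-stream";
    }
}

class DetectorDemo {
    public static void main(String[] args) throws IOException {
        SketchDetector chain =
                new DetectorChain(new GifMagicDetector(), new ExtensionDetector());
        byte[] gif = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(chain.detect(new ByteArrayInputStream(gif), "picture.dat"));
    }
}
```

putting the magic detector ahead of the name-based one is a design choice in this sketch: content evidence wins and the name is only a fallback.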

if B or C then the tika team would be very interested in contributions

