tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-605) Tika GDAL parser
Date Fri, 10 Oct 2014 06:18:34 GMT

     [ https://issues.apache.org/jira/browse/TIKA-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-605:
    Attachment: TIKA-605.Mattmann.100914.2.patch.txt

- OK, here is a fully working, complete version. Unit tests pass, the System.out.println calls
are removed, and it now handles all metadata. I had to change the invocation command because
ExternalParser cannot extract both Metadata *and* XHTML output from the same stream. Instead, I
carried forward ExternalParser's applyPatterns strategy and call it locally (inheritance was
blocked by private methods). I simply use ExternalParser to set up the command invocation, and
I parse both the output and the metadata from it myself. Give it a whirl!
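
For readers who want a feel for the approach described above, here is a rough, self-contained sketch. It is not the code in the attached patch and it does not use the Tika API: the class name PatternMetadataSketch, the helpers applyPatterns and toXhtml, the two regular expressions, and the sample command output are all made up for illustration. The idea it shows is capturing the command output once as text, then deriving both the metadata (by matching patterns line by line) and the XHTML body from that same text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMetadataSketch {

    // Hypothetical helper: scan each line of the captured command output and,
    // for every pattern that matches, store capture group 1 under the mapped key.
    static Map<String, String> applyPatterns(String output, Map<Pattern, String> patterns) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String line : output.split("\\R")) {
            for (Map.Entry<Pattern, String> entry : patterns.entrySet()) {
                Matcher m = entry.getKey().matcher(line);
                if (m.find()) {
                    metadata.put(entry.getValue(), m.group(1).trim());
                }
            }
        }
        return metadata;
    }

    // Hypothetical helper: wrap the same captured text in a minimal XHTML
    // document, escaping only the characters &, < and >.
    static String toXhtml(String output) {
        String escaped = output.replace("&", "&amp;")
                               .replace("<", "&lt;")
                               .replace(">", "&gt;");
        return "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><pre>"
                + escaped + "</pre></body></html>";
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the external command's output.
        String output = "Driver: GTiff/GeoTIFF\nSize is 512, 512\n";

        // Assumed patterns; group 1 is the value stored under the given key.
        Map<Pattern, String> patterns = new LinkedHashMap<>();
        patterns.put(Pattern.compile("^Driver:\\s*(.*)$"), "Driver");
        patterns.put(Pattern.compile("^Size is\\s*(.*)$"), "Size");

        // Both results are derived from the one captured string.
        System.out.println(applyPatterns(output, patterns));
        System.out.println(toXhtml(output));
    }
}
```

Because the output is read into a string first, the stream is consumed only once, which sidesteps the problem of needing the same stream for both metadata and XHTML.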

> Tika GDAL parser
> ----------------
>                 Key: TIKA-605
>                 URL: https://issues.apache.org/jira/browse/TIKA-605
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>         Environment: indep. of env.
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>              Labels: gdal, gsoc2013, integration, mentor, tika
>             Fix For: 1.7
>         Attachments: 0001-TIKA-605-Tika-GDAL-parser.patch, TIKA-605.Mattmann.092511.patch.txt,
> TIKA-605.Mattmann.100914.1.patch.txt, TIKA-605.Mattmann.100914.2.patch.txt
>
> Leverage the GDAL toolkit and its Java SWIG bindings to create a Tika parser around GDAL.
> See here: http://trac.osgeo.org/gdal/browser/trunk/gdal/swig/java/apps/gdalinfo.java

This message was sent by Atlassian JIRA
