tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-596) NetCDF and HDF files don't parse correctly from the command line via tika-app
Date Mon, 14 Feb 2011 13:31:57 GMT

    [ https://issues.apache.org/jira/browse/TIKA-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994288#comment-12994288 ]

Jukka Zitting commented on TIKA-596:

After thinking about this a bit, I believe an architecturally better solution would be for
the parsers to always output XHTML, even when the document has no text content. This is more
in line with the expectations set in the Parser javadoc and avoids the need for special-case
code like what you added in revision 1070359.

Adding empty XHTML output is as simple as this:

XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
xhtml.startDocument();
xhtml.endDocument();
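
To see what "always output XHTML even if empty" means at the SAX level, here is a rough sketch that uses only JDK classes and no Tika dependency. It sends the bare html/head/title/body skeleton to a ContentHandler and serializes the result. The class EmptyXhtmlSketch and its methods writeEmptyDocument and render are hypothetical names made up for this illustration; the events Tika's XHTMLContentHandler actually emits may differ in detail.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-in for illustration only; not part of Tika.
class EmptyXhtmlSketch {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Emit a well-formed XHTML skeleton with no text content.
    // A parser that extracts only metadata could still send these events,
    // so downstream handlers always see a complete document.
    static void writeEmptyDocument(ContentHandler handler) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        handler.startDocument();
        handler.startPrefixMapping("", XHTML);
        handler.startElement(XHTML, "html", "html", none);
        handler.startElement(XHTML, "head", "head", none);
        handler.startElement(XHTML, "title", "title", none);
        handler.endElement(XHTML, "title", "title");
        handler.endElement(XHTML, "head", "head");
        handler.startElement(XHTML, "body", "body", none);
        handler.endElement(XHTML, "body", "body");
        handler.endElement(XHTML, "html", "html");
        handler.endPrefixMapping("");
        handler.endDocument();
    }

    // Serialize the SAX events to a string using the JDK's TransformerHandler.
    static String render() {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler serializer = factory.newTransformerHandler();
            serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));
            writeEmptyDocument(serializer);
            return out.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

The point of the sketch is that the receiving handler gets a matching startDocument/endDocument pair and a complete element skeleton even when the parser has nothing but metadata to report, so callers need no special case for "no content".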

> NetCDF and HDF files don't parse correctly from the command line via tika-app
> -----------------------------------------------------------------------------
>                 Key: TIKA-596
>                 URL: https://issues.apache.org/jira/browse/TIKA-596
>             Project: Tika
>          Issue Type: Bug
>          Components: packaging, parser
>    Affects Versions: 0.8
>         Environment: while prepping 0.9 RC
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>            Priority: Blocker
>              Labels: cmd, error, hdf, line, netcdf, packaging, tika-app
>             Fix For: 0.9
> The tika-app command line interface seems to be broken for HDF and NetCDF files. For example:
> {noformat}
> [chipotle:trunk/tika-app/target] mattmann% java -jar tika-app-0.9-SNAPSHOT.jar -m /Users/mattmann/src/tika/trunk/tika-parsers/target/test-classes/test-documents/test.he5
> [chipotle:trunk/tika-app/target] mattmann% 
> {noformat}
> and:
> {noformat}
> [chipotle:trunk/tika-app/target] mattmann% java -jar tika-app-0.9-SNAPSHOT.jar -m /Users/mattmann/src/tika/tags/0.8/tika-parsers/src/test/resources/test-documents/sresa1b_ncar_ccsm3_0_run1_200001.nc
> [chipotle:trunk/tika-app/target] mattmann% 
> {noformat}

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

