tika-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1363) .mat files not parsing
Date Tue, 08 Jul 2014 14:10:05 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054966#comment-14054966 ]

Ken Krugler commented on TIKA-1363:
-----------------------------------

For issues like these (where the cause could be your environment, for example), it's best
to post to the mailing list first; then, after researching the issue, create a Jira issue
if it really does look like a bug. You'll also get more visibility, and likely a better
response, on the mailing list.

Having said that, I tried it myself, and I do get (minimal) metadata: the Content-Type is
application/x-matlab-data, but no text is extracted.
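
To rule out a damaged or mislabelled file independently of Tika, one option is to check the
file header directly. This is a minimal sketch, assuming the test file is a Level 5 MAT-file
(not confirmed in this thread): such files start with a 128-byte header whose descriptive text
begins with "MATLAB 5.0 MAT-file" and whose last two bytes are an endian indicator. The
function name describe_mat_header is hypothetical and is not part of Tika.

```python
# Assumption: the file is a Level 5 MAT-file with the standard 128-byte header.
MAT5_MAGIC = b"MATLAB 5.0 MAT-file"
HEADER_SIZE = 128


def describe_mat_header(data: bytes):
    """Return (looks_like_mat5, byte_order) for the leading bytes of a file.

    byte_order is "little", "big", or None when the header is not recognised.
    """
    if len(data) < HEADER_SIZE or not data.startswith(MAT5_MAGIC):
        return (False, None)
    # Bytes 126-127 hold the characters "MI" written as a 16-bit value,
    # so they appear as "IM" on disk when the writer was little-endian.
    indicator = data[126:128]
    if indicator == b"IM":
        return (True, "little")
    if indicator == b"MI":
        return (True, "big")
    return (False, None)
```

To use it, read the first 128 bytes of the .mat file in binary mode, for example
describe_mat_header(open(path, "rb").read(128)), where path is the test document. If the
header is recognised, the file itself is probably fine and the missing text is more likely a
parser or environment question.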

> .mat files not parsing
> ----------------------
>
>                 Key: TIKA-1363
>                 URL: https://issues.apache.org/jira/browse/TIKA-1363
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.6
>            Reporter: Ann Burgess
>              Labels: metadata, parser, snapshot
>
> We recently committed a parser for Matlab .mat files; however, I've just downloaded the
> most recent Tika and am not getting any parsed --text or --metadata output for the .mat
> file used in the unit test. The steps I've used are below. Am I missing something at the
> command line? Can anyone else successfully get text or metadata output for a .mat file?
> Steps: 
> svn co https://svn.apache.org/repos/asf/tika/trunk tika
> setenv MAVEN_OPTS "-Xms128m -Xmx256m"
> cd tika
> mvn install
> java -jar tika-app/target/tika-app-1.6-SNAPSHOT.jar --text /Users/IGSWAHWSWBURGESS/Development/tika/tika-parsers/src/test/resources/test-documents/breidamerkurjokull_radar_profiles_2009.mat



--
This message was sent by Atlassian JIRA
(v6.2#6252)
