tika-dev mailing list archives

From "Ann Burgess (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1363) .mat files not parsing
Date Mon, 14 Jul 2014 22:02:05 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061323#comment-14061323 ]

Ann Burgess commented on TIKA-1363:
-----------------------------------

I just pulled the most recent Tika and I'm still not getting any text from
the Matlab parser:
$ svn co http://svn.apache.org/repos/asf/tika/trunk tika
$ mvn install
$ java -jar tika-app/target/tika-app-1.6-SNAPSHOT.jar --text \
    /Users/IGSWAHWSWBURGESS/Development/tika/tika-parsers/src/test/resources/test-documents/test_mat_text.mat
$

It does seem like the mime-type is recognized:

$ java -classpath \
    annie-parsers.jar:tika-app/target/tika-app-1.6-SNAPSHOT.jar \
    org.apache.tika.cli.TikaCLI --detect \
    /Users/IGSWAHWSWBURGESS/Development/tika/tika-parsers/src/test/resources/test-documents/breidamerkurjokull_radar_profiles_2009.mat
$ application/x-matlab-data

Tyler, did you integrate the patch and get -t and -m output?  Want to make
sure I'm not missing a step.
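
For anyone trying to reproduce this, here is a minimal sketch of checking the
three CLI modes (--detect, --text, --metadata) in one pass and flagging the
ones that print nothing. It is only an illustration: report_output is a
hypothetical helper, not part of Tika, and the TIKA_JAR and MAT_FILE defaults
are assumed paths relative to the top of the tika checkout. It is written for
POSIX sh rather than csh.

```shell
#!/bin/sh
# Sketch only: report_output is a hypothetical helper, not part of Tika.

# Run a command and report whether it wrote anything to stdout.
report_output() {
    label=$1
    shift
    status=0
    # Capture stdout only; stderr still reaches the terminal, so any parser
    # exception stays visible. Trailing newlines are stripped by $(...), so
    # output consisting only of newlines counts as empty.
    out=$("$@") || status=$?
    if [ -n "$out" ]; then
        echo "$label: output (exit $status)"
    else
        echo "$label: EMPTY (exit $status)"
    fi
}

# Assumed locations, relative to the top of the tika checkout.
TIKA_JAR=${TIKA_JAR:-tika-app/target/tika-app-1.6-SNAPSHOT.jar}
MAT_FILE=${MAT_FILE:-tika-parsers/src/test/resources/test-documents/test_mat_text.mat}

if [ -f "$TIKA_JAR" ] && [ -f "$MAT_FILE" ]; then
    report_output detect   java -jar "$TIKA_JAR" --detect   "$MAT_FILE"
    report_output text     java -jar "$TIKA_JAR" --text     "$MAT_FILE"
    report_output metadata java -jar "$TIKA_JAR" --metadata "$MAT_FILE"
else
    echo "skipping: jar or .mat file not found" >&2
fi
```

Saved under a name such as check_mat.sh (hypothetical), it can be run from a
csh or tcsh prompt with "sh check_mat.sh". A line reading "text: EMPTY (exit 0)"
would match what I am seeing: the command succeeds but prints nothing.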


On Mon, Jul 14, 2014 at 1:16 PM, Chris A. Mattmann (JIRA) <jira@apache.org>




-- 
------------------------------------------------------------------------------------------
Ann Bryant Burgess, PhD

Postdoctoral Fellow
Computer Science Department
University of Southern California
Viterbi School of Engineering
Los Angeles, CA

Alaska Science Center/USGS
Anchorage, AK

Cell:  (585) 738-7549
Office:  (907) 786-7059
Fax:  (907) 786-7150
E-mail: anniebryant.burgess@gmail.com
Office Address: 4210 University Dr., Anchorage, AK 99508-4626
-------------------------------------------------------------------------------------------


> .mat files not parsing
> ----------------------
>
>                 Key: TIKA-1363
>                 URL: https://issues.apache.org/jira/browse/TIKA-1363
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.6
>            Reporter: Ann Burgess
>              Labels: metadata, parser, snapshot
>         Attachments: test_data_1.mat
>
>
> We recently committed a parser for Matlab .mat files; however, I've just downloaded the
> most recent Tika and am not getting any parsed --text or --metadata output for the .mat file used
> in the unit test.  The steps I've used are below.  Am I missing something at the command line?
> Can anyone else successfully get text or metadata output for a .mat file?
> Steps: 
> svn co https://svn.apache.org/repos/asf/tika/trunk tika
> setenv MAVEN_OPTS "-Xms128m -Xmx256m"
> cd tika
> mvn install
> java -jar tika-app/target/tika-app-1.6-SNAPSHOT.jar --text /Users/IGSWAHWSWBURGESS/Development/tika/tika-parsers/src/test/resources/test-documents/breidamerkurjokull_radar_profiles_2009.mat



--
This message was sent by Atlassian JIRA
(v6.2#6252)
