tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server
Date Fri, 17 Jul 2015 13:21:04 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631326#comment-14631326 ]

Chris A. Mattmann commented on TIKA-1690:

Hey [~tallison@apache.org], we should probably be more consistent about using TikaUtils.getInputStream
in those multipart methods.

> Inconsistent (buggy) behavior when using tika-server
> ----------------------------------------------------
>                 Key: TIKA-1690
>                 URL: https://issues.apache.org/jira/browse/TIKA-1690
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Namrata Malarout
>            Assignee: Tim Allison
> I am using Tika trunk (1.10-SNAPSHOT) and posting documents to it. For example:
> curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: application/json"
> …
> curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: application/rdf+xml"
> …
> curl -T MOD09GA.A2014010.h30v12.005.2014012183944.vegetation_fraction.tif http://localhost:9998/meta --header "Accept: text/csv"
> I am using a Python script to iterate through all the files in a folder. It works for
> about 50% to 80% of the files; for the rest, the server returns a 500 error. When I then
> individually post a file that failed under the script, it sometimes works. Ad hoc requests
> succeed most of the time but still fail occasionally. At times a file succeeds with
> application/rdf+xml but fails with application/json. The behavior is inconsistent.
> Here is an example trace of when it does not work as expected [0]
> A sample of the data being used can be found here [1]
> Any help would be appreciated. 
> [0] https://paste.apache.org/lbAm
> [1] https://drive.google.com/file/d/0B6wmo4_-H0P2eWJjdTdtYS1HRGs/view?usp=sharing
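
For anyone trying to reproduce the report, here is a minimal sketch of the kind of loop described above. The reporter's actual script is not shown in the issue, so every name here (`build_meta_request`, `put_status`, `check_folder`) is hypothetical; the only details taken from the report are the `/meta` endpoint on localhost:9998, the PUT upload that `curl -T` performs, and the three Accept types. It lists each file and Accept type that did not return 200.

```python
import sys
import urllib.error
import urllib.request
from pathlib import Path

# Assumption: tika-server is reachable at the address used in the curl examples.
META_URL = "http://localhost:9998/meta"
ACCEPT_TYPES = ("application/json", "application/rdf+xml", "text/csv")


def build_meta_request(path, accept, url=META_URL):
    """Build a PUT request with the file as body, like `curl -T <file> <url>`."""
    data = Path(path).read_bytes()
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": accept}
    )


def put_status(request, timeout=60):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the code is what we want to record.
        return err.code


def check_folder(folder):
    """Return (file name, Accept type, status) for every non-200 response."""
    failures = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        for accept in ACCEPT_TYPES:
            status = put_status(build_meta_request(path, accept))
            if status != 200:
                failures.append((path.name, accept, status))
    return failures


if __name__ == "__main__":
    for name, accept, status in check_folder(sys.argv[1]):
        print(f"{status}\t{accept}\t{name}")
```

Running it several times over the same folder and comparing the failure lists would show whether the 500s follow particular files or vary from run to run.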

This message was sent by Atlassian JIRA