tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: 1.18 pre rc regression tests
Date Wed, 28 Mar 2018 19:54:32 GMT
Still waiting for reports...

We've had quite a few files go from application/x-123 to image/x-tga via TIKA-2527.

I think this is expected because they all appear to be embedded files with file names that
end in .tga, but I wanted to confirm.

There's also one example of application/x-stata-dta -> image/x-tga, which is probably
wrong:

http://162.242.228.174/docs/commoncrawl2_likely_broken/BT/BTTVHEUDLE7WODDGPYT6LLA6LXMHS3CX.dta
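
For anyone who wants to check for similar changes in their own runs, here is a minimal sketch of tallying mime transitions between two runs. The function name, the file ids, and the sample mappings are hypothetical and only for illustration; they are not taken from the actual regression reports or the eval tooling.

```python
from collections import Counter


def mime_transitions(before, after):
    """Count (old_mime, new_mime) pairs for files whose detected type changed.

    `before` and `after` map a file id to the mime type detected in each run.
    Files missing from `after` are skipped.
    """
    counts = Counter()
    for file_id, old_mime in before.items():
        new_mime = after.get(file_id)
        if new_mime is not None and new_mime != old_mime:
            counts[(old_mime, new_mime)] += 1
    return counts


# Hypothetical sample data, not real report output.
before = {
    "a.tga": "application/x-123",
    "b.tga": "application/x-123",
    "c.dta": "application/x-stata-dta",
    "d.pdf": "application/pdf",
}
after = {
    "a.tga": "image/x-tga",
    "b.tga": "image/x-tga",
    "c.dta": "image/x-tga",
    "d.pdf": "application/pdf",
}

# List the transitions, most frequent first.
for (old, new), n in mime_transitions(before, after).most_common():
    print(f"{old} -> {new}: {n}")
```

Sorting by count makes rare transitions, like the single .dta case, easy to spot at the bottom of the list.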




-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org] 
Sent: Wednesday, March 28, 2018 10:55 AM
To: dev@tika.apache.org
Subject: 1.18 pre rc regression tests

All,
I've run the initial regression tests. The corpus is now big enough that I have to migrate
the H2 tables to Postgres before writing the reports. I'll post the reports as soon as they're
ready, but I'm starting to go through some results now.

Cheers,

                Tim

