tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: got docx?
Date Mon, 12 Dec 2016 14:57:55 GMT
To close the loop and share my gratitude publicly...

Thank you, Dominik, for transferring 41k docx/dotx files (5GB) to our regression corpus!

I’ve already found a number of “areas for improvement” in Tika's experimental docx SAX
parser, and a few areas for improvement in POI's XWPFDocument/DOM parser…all thanks to your
documents and your Common Crawl code.

Thank you!


