tika-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Parsing incomplete PDF and Office files
Date Fri, 14 Nov 2008 10:30:51 GMT

On Fri, Nov 14, 2008 at 1:22 AM, Jonathan Koren <jonathan@soe.ucsc.edu> wrote:
> On a related note, does Tika support full text extraction of PDFs?

Yes. See http://incubator.apache.org/tika/formats.html (to be moved to
lucene.apache.org) for all the supported formats.


Jukka Zitting
