tika-dev mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: FW: Apache Tika used to parse the Panama papers!
Date Wed, 06 Apr 2016 17:34:18 GMT
Yes I read about that too :-)

It would be interesting to hear whether they had any problems, whether 
they made any support requests, and whether those were answered 
successfully. Were there any files that failed or did poorly? Or was 
everything so good that no help was needed at all?

I'm delighted that a Java product was used, even though native code 
products would likely have been faster.

Tilman (I'm slightly skeptical about the ICIJ because of the funding and 
the suspicious lack of US data, but as a huge data archeology project, I 
love it!)

Am 06.04.2016 um 19:18 schrieb Allison, Timothy B.:
> Looks like quite a few PDFs [0]...
>
> Couldn't have done it without you!
>
> Cheers,
>
>             Tim
>
> P.S. Tip of the hat to Andreas for retweeting the link!
>
> [0] https://twitter.com/bigdata/status/717346207312392192
>
> -----Original Message-----
> From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov]
> Sent: Tuesday, April 05, 2016 6:47 PM
> To: dev@tika.apache.org
> Cc: press@apache.org
> Subject: Apache Tika used to parse the Panama papers!
>
> FYI:
> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak/?utm_campaign=ForbesTech&utm_source=TWITTER&utm_medium=social&utm_channel=Technology&linkId=23087770#709893771df5
>
>
> BTW I know Thomas and am in touch. He wrote an article about MEMEX last year.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

