lucene-general mailing list archives

From "Uwe Schindler"
Subject Apache Solr and Tika used to index Panama Papers
Date Wed, 06 Apr 2016 08:45:19 GMT
Hi all,

I just wanted to repost the following message from Chris Mattmann on the Tika list:

If you have been following the news, you’ve seen the Panama Papers and how the world’s
rich and elite have been storing their money offshore to hide it. Two of the ASF’s key
technologies were used in uncovering that story and showing the world what was going on: Apache
Tika and Apache Solr.

Solr was used to make the terabytes of Panama Papers documents searchable for journalists. The
documents were preprocessed for indexing with Tika (possibly through Solr's contrib/extraction module).

Here is the article by Forbes about that:


Uwe Schindler 
ASF Member, Apache Lucene PMC / Committer
Bremen, Germany
