lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Indexing PDF files with Solr 6.6 while allowing highlighting matched text with context
Date Wed, 21 Jun 2017 00:53:03 GMT
>http -  however, the big advantage of doing your indexing on a different machine is that
the heavy lifting that Tika does in extracting text from documents, finding metadata, etc. is
not happening on the server. If the indexer crashes, it doesn’t affect Solr either.

+1 

For what can go wrong: http://events.linuxfoundation.org/sites/events/files/slides/ApacheConMiami2017_tallison_v2.pdf


https://www.youtube.com/watch?v=vRPTPMwI53k&t=13s&index=43&list=PLbzoR-pLrL6pLDCyPxByWQwYTL-JrF5Rp

Really, we try our best on Tika, but sometimes bad things happen.  Let us know when they do,
and we'll try to fix them.
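
A minimal sketch of the separation described in the quoted text, assuming extraction runs as a child process of a standalone indexer rather than inside Solr. This is not Tika's or Solr's API: `extract_text`, `index_documents`, and `send_to_index` are hypothetical names, and the extraction command is a stand-in for whatever actually invokes Tika. The point is only that a crash or hang in extraction is contained and the rest of the batch continues.

```python
import subprocess
import sys


def extract_text(command, timeout_seconds=30):
    """Run an extraction command in a child process.

    Returns the child's stdout as text, or None if the child timed out
    or exited with a non-zero status. The calling process keeps running
    in every case.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires.
        return None
    if result.returncode != 0:
        return None
    return result.stdout


def index_documents(jobs, send_to_index):
    """Extract each document and hand the text to send_to_index.

    jobs is an iterable of (doc_id, command) pairs. send_to_index is a
    hypothetical callback that would post the text to the search server.
    Returns the list of doc_ids whose extraction failed.
    """
    failed = []
    for doc_id, command in jobs:
        text = extract_text(command)
        if text is None:
            failed.append(doc_id)  # record the failure and keep going
            continue
        send_to_index(doc_id, text)
    return failed


if __name__ == "__main__":
    # Stand-in commands: one that succeeds and one that fails.
    ok = [sys.executable, "-c", "print('extracted text')"]
    crash = [sys.executable, "-c", "raise SystemExit(3)"]
    failures = index_documents(
        [("doc-1", ok), ("doc-2", crash)],
        lambda doc_id, text: print(doc_id, text.strip()),
    )
    print("failed:", failures)
```

Collecting the failed document ids rather than discarding them also makes it easier to report problem files back to the Tika project.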