lucene-solr-user mailing list archives

From Pål Brattberg <...@subtree.se>
Subject Best practice schema.xml when importing rich documents?
Date Tue, 06 Dec 2011 14:39:00 GMT
I'm working with SOLR to import rich documents, mainly MS Word, PowerPoint, Excel and PDFs.

Is there a best practice schema.xml and/or solrconfig.xml to use in SOLR when using the ExtractingRequestHandler?

I have been tweaking the default schema to try to get facets working on document modification
dates, but even without that, I figure there could very well exist a good example of how these
files should look when the default output from Tika is enough.

Any pointers are most welcome!

(I also posted this on StackOverflow, so feel free to answer there if you crave points; http://stackoverflow.com/questions/8393417)

Thanks!

/ Pål 