lucene-solr-user mailing list archives

From Joe Kessel <>
Subject Multiple field / pdf file per document
Date Tue, 25 Aug 2009 00:45:51 GMT

New to Solr, not so new to search.  I have an existing data model that I am pushing into a
Solr index.  For example, I am indexing a product which includes product brochures in multiple
locales.  So this single Solr document contains multiple text fields, each of which requires a
language-specific analyzer.  The data for these text fields comes from multiple pdf files.  As I am
currently supporting 4 locales, I will have a different pdf file for each locale.  In addition I have
a number of other fields that are used by the application.  Solr will be returning a reference
used by the application to determine what data and pdf to display.  With the
ExtractingRequestHandler I don't see how I would be able to support multiple pdf files per
document.  I am planning to use Solr 1.4 for this project.


My assumption is that I will need to do the pdf parsing prior to sending the document to Solr.
Is there a way to do the extraction at the field level?  I want to make sure I am not missing
something in Solr Cell before I invest the effort to parse the documents on the client side.
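
To make the client-side approach concrete, here is a rough sketch of what I have in mind, assuming the text has already been pulled out of each pdf by some parser on the client. The field names (brochure_en, brochure_fr, ...), the locale codes, and the helper build_add_xml are made up for illustration and are not from my actual schema. It only builds the standard Solr XML <add> message with one text field per locale, which would then be posted to the update handler.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id, other_fields, brochure_text_by_locale):
    """Build a Solr XML <add> message for one product document.

    brochure_text_by_locale maps a locale code to text that was already
    extracted from that locale's pdf on the client side. Each locale gets
    its own field (hypothetical naming: brochure_<locale>) so the schema
    can attach a different analyzer to each one.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    # Collect all fields in one ordered mapping: id, application fields,
    # then one brochure text field per locale.
    fields = {"id": doc_id}
    fields.update(other_fields)
    for locale, text in brochure_text_by_locale.items():
        fields["brochure_" + locale] = text

    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value

    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Placeholder strings stand in for text extracted from the pdf files.
    print(build_add_xml(
        "product-1",
        {"name": "Widget"},
        {"en": "English brochure text", "fr": "Texte de la brochure"},
    ))
```

The design choice here is one Solr document per product, with the per-locale text kept in separate fields, so the application fields and the reference Solr returns stay in a single record.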


