lucene-solr-user mailing list archives

From "Glock, Thomas" <thomas.gl...@pfizer.com>
Subject RE: Index documents with Solr
Date Thu, 05 Nov 2009 13:43:27 GMT
I have a similar situation, but I am not expecting an easy setup.  Currently the tables contain
both a URL to the file and quite a bit of additional metadata about the file.  I'm planning
one initial load to Solr: my own utility will create the XML and post it.  The data is messy,
so DIH is not a good choice for this situation.  After the initial load (only ~12K documents,
10 minutes tops), I plan to perform a second pass that will use the ExtractingRequestHandler.
I know how the id will map, but it is not yet clear to me how to get that id to the
ExtractingRequestHandler.  It would be good to see different examples on the Wiki.  I have not
made a first attempt yet; I'm hoping to in a day or so.
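
A minimal sketch of the two passes described above, not a tested setup. The first function builds a Solr XML add message for the initial metadata load. The second builds a request URL for the extract pass, passing the id through the handler's `literal.<fieldname>` request parameter, which is one way to get a known id to ExtractingRequestHandler. The base URL, the `/update/extract` handler path, and the field names `url` and `title` are assumptions for illustration. As far as I know, posting a document with an existing id replaces the earlier one, so the metadata from the first pass would probably need to be resent as further `literal.*` parameters.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def build_add_xml(doc_id, fields):
    """Build a Solr <add><doc> XML message for one record.

    `fields` maps hypothetical field names to values; "id" is added first.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in [("id", doc_id)] + sorted(fields.items()):
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)  # ElementTree escapes &, <, > on output
    return ET.tostring(add, encoding="unicode")


def build_extract_url(solr_base, doc_id, literals=None, commit=False):
    """Build the URL for posting one file to the extract handler.

    Assumes the handler is registered at /update/extract. The id and any
    extra metadata travel as literal.<fieldname> parameters.
    """
    params = [("literal.id", doc_id)]
    for name, value in sorted((literals or {}).items()):
        params.append(("literal." + name, value))
    if commit:
        params.append(("commit", "true"))
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)
```

The file itself would then be sent as the body of a POST to that URL, for example with `urllib.request` or `curl`; that step is left out here because it depends on a running Solr instance.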


-----Original Message-----
From: javaxmlsoapdev [mailto:vikasdp@yahoo.com]
Sent: Wed 04-Nov-2009 5:42 PM
To: solr-user@lucene.apache.org
Subject: Index documents with Solr
 

I wanted to find out how people are using Solr's ExtractingRequestHandler to
index different types of documents from a configuration file, in an import
fashion. I want to use this handler in a way similar to how DataImportHandler
works, where you can issue an "import" command from the URL to create an index
by reading database table(s).

For documents, I have a db table which stores file paths. I want to read
each file's location from the db table, then create an index after reading the
document content using ExtractingRequestHandler. Again, I am trying to see if
all this can be done just from configuration, the same way DataImportHandler
handles this.
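
One way to approximate the table-driven flow from a small script rather than from configuration is sketched below. It reads ids and file paths from a table and yields one job per file, each of which would then be posted to ExtractingRequestHandler. The table name `documents`, the column names `id` and `file_path`, and the use of SQLite as a stand-in database are all assumptions for illustration.

```python
import sqlite3


def iter_extract_jobs(conn, table="documents"):
    """Yield (doc_id, file_path) for every row that has a file path.

    The table and column names are hypothetical. A table name cannot be a
    bound parameter, so it is checked before being placed in the query.
    """
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % table)
    cursor = conn.execute(
        "SELECT id, file_path FROM %s ORDER BY id" % table
    )
    for doc_id, file_path in cursor:
        if file_path:  # skip rows with a NULL or empty path
            yield str(doc_id), file_path


def demo():
    """Run the reader against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, file_path TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [(1, "/data/a.pdf"), (2, None), (3, "/data/c.doc")],
    )
    jobs = list(iter_extract_jobs(conn))
    conn.close()
    return jobs
```

Each job carries the id that the indexed document should receive and the path of the file whose content is to be extracted.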

-- 
View this message in context: http://old.nabble.com/Index-documents-with-Solr-tp26205991p26205991.html
Sent from the Solr - User mailing list archive at Nabble.com.


