lucene-solr-user mailing list archives

From Michael Kelleher
Subject ExtractingRequestHandler and HTML
Date Mon, 12 Dec 2011 13:06:51 GMT
I am submitting HTML documents to Solr using the ExtractingRequestHandler
(ERH).  Is it possible to store the full contents of a document, including
all markup, in a field?  Using fmap.content (I am assuming the content
field comes from Tika) stores the extracted text of the document in a
field, but not the markup.  I want the whole unaltered document.

Is this possible?
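
For reference, the kind of extract request described above might be built
roughly as follows.  This is only a sketch: the Solr URL, the document id
"doc1", the target field name "text", and the helper build_extract_url are
assumptions made for illustration, not details from the actual setup.  It
only constructs the request URL; the HTML file itself would be sent as the
POST body.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, content_field):
    """Build an ExtractingRequestHandler URL (hypothetical helper).

    literal.id   sets the document's id field to a fixed value.
    fmap.content maps Tika's extracted text (no markup) to content_field.
    """
    params = {
        "literal.id": doc_id,
        "fmap.content": content_field,
        "commit": "true",
    }
    return base_url + "?" + urlencode(params)


# Assumed values: a local Solr instance and a field named "text".
url = build_extract_url(
    "http://localhost:8983/solr/update/extract", "doc1", "text"
)
print(url)
```

With a request like this, the "text" field receives only the extracted
text, which is the behaviour the question is about.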


