lucene-solr-user mailing list archives

From rajan chandi <>
Subject Re: Using SolrJ with Tika
Date Wed, 02 Sep 2009 14:13:22 GMT

Check out Solr 1.4.

You can download the trunk and build it on your box.

Solr 1.4 does this out of the box. No configuration required.

You can use HTTP POST to send the document with a command-line utility such as
curl, and PDF/Word/RTF/PPT/XLS files etc. will be indexed. We tested this last

Tika has already been included in Solr 1.4.
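
Since the question was about doing this from Java rather than curl, here is a
rough sketch of the same HTTP POST using only the JDK's own HTTP client (Java 11
or later), not SolrJ. Several things in it are assumptions rather than anything
confirmed in this thread: the base URL http://localhost:8983/solr, the handler
being mapped to /update/extract in solrconfig.xml, the literal.id and commit
parameter names, and the class and method names, which are made up for
illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

public class ExtractPost {

    // Builds the request URL. The "/update/extract" path and the
    // "literal.id" / "commit" parameters are assumptions about how the
    // extracting handler is registered in your solrconfig.xml.
    static URI buildExtractUri(String baseUrl, String docId, boolean commit) {
        String query = "literal.id=" + URLEncoder.encode(docId, StandardCharsets.UTF_8)
                + "&commit=" + commit;
        return URI.create(baseUrl + "/update/extract?" + query);
    }

    // Wraps the raw file bytes in a POST, the same way
    // "curl --data-binary @file -H 'Content-Type: ...'" would.
    static HttpRequest buildRequest(URI uri, Path file, String contentType) throws IOException {
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", contentType)
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: ExtractPost <file> <docId> [contentType]");
            return;
        }
        Path file = Path.of(args[0]);
        String contentType = args.length > 2 ? args[2] : "application/pdf";

        // Assumed local Solr instance; adjust to your own host, port and core.
        URI uri = buildExtractUri("http://localhost:8983/solr", args[1], true);

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(uri, file, contentType), HttpResponse.BodyHandlers.ofString());

        // Solr answers with a status document; print it for inspection.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Something like "java ExtractPost.java report.pdf doc1" would send report.pdf with
the id doc1. The file name and id here are placeholders.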


On Wed, Sep 2, 2009 at 5:26 PM, Angel Ice <> wrote:

> Hi everybody.
> I hope this is the right place for questions; if not, sorry.
> I'm trying to index rich documents (PDF, MS docs, etc.) in Solr/Lucene.
> I have seen a few examples explaining how to use Tika to solve this, but
> most of these examples use curl to send documents to Solr, or an HTML
> POST with an input file.
> However, I'd like to do it entirely in Java.
> Is there a way to use SolrJ to index the documents with the
> ExtractingRequestHandler of Solr, or at least to get the extracted XML back
> (with the extract.only option)?
> Many thanks.
> Laurent.
