lucene-solr-user mailing list archives

From Jorg Heymans <jorg.heym...@gmail.com>
Subject Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)
Date Tue, 08 Dec 2009 12:02:38 GMT
Hi,

I am looking into using Solr for indexing a large database that has
documents (mostly pdf and msoffice) stored as CLOBs in several tables.
It is my understanding that the DIH as provided in Solr 1.4 cannot
index these CLOBs yet, and that SOLR-1358 should provide exactly this.
So I was wondering what the recommended way of solving this is. Should
it be done with a custom text extractor of some sort, set on the
column/field?

Thanks,
Jorg
