lucene-solr-user mailing list archives

From olivier sallou <olivier.sal...@gmail.com>
Subject Need help on Solr Cell usage with specific Tika parser
Date Mon, 14 Jun 2010 16:14:04 GMT
Hi,
I use Solr Cell to send files with specific content. I developed a dedicated
Tika parser for their specific MIME types.
However, I cannot get Solr to accept my new MIME types.

In solrconfig.xml, in the update/extract request handler, I specified <str
name="tika.config">./tika-config.xml</str>, where tika-config.xml is in the
conf directory (the same directory as solrconfig.xml).
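
A minimal sketch of that handler entry, assuming the stock handler name and
class from the Solr example configuration (both are assumptions here, and all
other settings are omitted):

```xml
<!-- Assumed: stock Solr Cell handler name and class; not copied from the real solrconfig.xml -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <!-- Path to the custom Tika configuration, as quoted above -->
  <str name="tika.config">./tika-config.xml</str>
</requestHandler>
```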

In tika-config I added my mimetypes:

<parser name="parse-readseq"
        class="org.irisa.genouest.tools.readseq.ReadSeqParser">
    <mime>biosequence/document</mime>
    <mime>biosequence/embl</mime>
    <mime>biosequence/genbank</mime>
</parser>

I am not sure about this line:
  <mimeTypeRepository resource="./tika-mimetypes.xml" magic="false"/>

I do not know whether the path to tika-mimetypes.xml should be absolute or
relative, or even whether this file needs to be redefined if "magic" is not
used.


When I run my update/extract request, I get an error saying that
"biosequence/document" does not match any known parser.

Thanks

Olivier
