lucene-solr-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Issue Indexing zip file content in Solr 1.4
Date Sat, 21 Nov 2009 11:07:21 GMT

On Nov 21, 2009, at 4:06 AM, Kerwin wrote:

> Hi,
> 
> Has anyone faced this issue? If so, why is Tika 0.4 bundled with Solr 1.4
> instead of Tika 0.5?

Tika 0.5 was released after Solr 1.4.  See https://issues.apache.org/jira/browse/SOLR-1567

> 
> Problem:
> I have a zip file with multiple files of different formats in it.
> I am trying to index the zip file content with Solr 1.4, but the Autodetect
> parser context is not being passed with the current 1.4 distribution of the
> ExtractingDocumentLoader. So I am unable to index zip file content, since an
> Empty parser is being created. After indexing the file, only the package
> entries are displayed as content.
> I replaced the Tika 0.4 that comes with the Solr 1.4 distribution with Tika 0.5,
> along with some other POI jars, and this seems to work, as the context is now
> being passed and the delegate parser is able to delegate to the correct
> parser.
> 
> In Tika 0.4 the Autodetect parser does not create the context but in Tika
> 0.5 it creates the context before calling the parse method.
> 
> Am I missing something? Please advise.

Sounds like we just need to upgrade.  What you did is perfectly reasonable.
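
For anyone following the thread, here is a rough model of the behavior Kerwin describes: a container parser that looks up a delegate in a context object and falls back to an empty parser when none was supplied. This is only a sketch using java.util.zip from the JDK. The names Context, EntryParser, EMPTY, PLAIN_TEXT, parseZip and sampleZip are hypothetical stand-ins invented for illustration; they are not Tika's or Solr's actual classes, and the real delegation logic may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipDelegationSketch {

    /** Hypothetical stand-in for a parser: turns one entry's stream into text. */
    interface EntryParser {
        String parse(InputStream in) throws IOException;
    }

    /** Hypothetical stand-in for a parse context: a lookup table keyed by type. */
    static final class Context {
        private final Map<Class<?>, Object> values = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            values.put(key, value);
        }

        <T> T get(Class<T> key, T fallback) {
            Object value = values.get(key);
            return value == null ? fallback : key.cast(value);
        }
    }

    /** Fallback used when the context carries no delegate: extracts nothing. */
    static final EntryParser EMPTY = in -> "";

    /** Simple delegate that treats every entry as UTF-8 text. */
    static final EntryParser PLAIN_TEXT =
            in -> new String(readAll(in), StandardCharsets.UTF_8).trim();

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Emits each entry name, plus whatever the delegate extracts from the entry. */
    static String parseZip(InputStream zipStream, Context context) throws IOException {
        EntryParser delegate = context.get(EntryParser.class, EMPTY);
        StringBuilder text = new StringBuilder();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                text.append(entry.getName()).append('\n');
                String body = delegate.parse(zip); // reads only the current entry
                if (!body.isEmpty()) {
                    text.append(body).append('\n');
                }
            }
        }
        return text.toString();
    }

    /** Builds a small in-memory zip with two text entries for the demo. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("world".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // No delegate in the context: only the entry names come out.
        System.out.print(parseZip(new ByteArrayInputStream(sampleZip()), new Context()));

        // Delegate supplied through the context: entry contents are extracted too.
        Context withDelegate = new Context();
        withDelegate.set(EntryParser.class, PLAIN_TEXT);
        System.out.print(parseZip(new ByteArrayInputStream(sampleZip()), withDelegate));
    }
}
```

In this model, an empty context yields only the entry names, which matches the "only the package entries are displayed" symptom. Once a delegate is placed in the context, the entry contents are extracted as well.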