lucene-solr-user mailing list archives

From scorpking <lehoank1...@gmail.com>
Subject indexing data from rich documents - Tika with solr3.1
Date Fri, 09 Sep 2011 10:58:30 GMT
Hi everyone,
I have a problem with Tika and Solr. I can successfully index data from
various file formats (pdf, doc...) when I give an absolute file path. But now
I have a link to a file on the internet (e.g. http://myweb/filename.pdf). I
want to index from this link, but it does not work, and I don't know why.
This is my dataconfig.xml:

<dataConfig>
    <dataSource type="BinFileDataSource" name="bin"/>
    <document>
        <entity name="tika-test" processor="TikaEntityProcessor"
                url="http://myweb/filename.pdf" format="text" dataSource="bin">
            <field column="Author" name="author" meta="true"/>
            <field column="title" name="title" meta="true"/>
            <field column="text" name="text"/>
        </entity>
    </document>
</dataConfig>

When I replace url="http://myweb/filename.pdf" with an absolute file path, it
works fine.
Does anyone know what is wrong?
Thanks for your help.
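
One possible direction, as an untested sketch only: DataImportHandler also
ships a BinURLDataSource, which reads binary content from a URL, whereas
BinFileDataSource reads from the local file system. The variant below changes
only the dataSource type and keeps everything else from the config above.
Whether this resolves the problem on this setup is an assumption.

```xml
<dataConfig>
    <!-- Assumption: BinURLDataSource swapped in for BinFileDataSource so the
         entity's url attribute is fetched over HTTP instead of opened as a
         local file. -->
    <dataSource type="BinURLDataSource" name="bin"/>
    <document>
        <entity name="tika-test" processor="TikaEntityProcessor"
                url="http://myweb/filename.pdf" format="text" dataSource="bin">
            <field column="Author" name="author" meta="true"/>
            <field column="title" name="title" meta="true"/>
            <field column="text" name="text"/>
        </entity>
    </document>
</dataConfig>
```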

--
View this message in context: http://lucene.472066.n3.nabble.com/indexing-data-from-rich-documents-Tika-with-solr3-1-tp3322555p3322555.html
Sent from the Solr - User mailing list archive at Nabble.com.
