lucene-solr-user mailing list archives

From Roy Liu <liuchua...@gmail.com>
Subject Re: How to index PDF file stored in SQL Server 2008
Date Mon, 11 Apr 2011 06:02:06 GMT
Hi all,
Thank you very much for your kind help.

1. I have upgraded from Solr 1.4 to Solr 3.1.
2. I changed data-config-sql.xml:

<dataConfig>
  <dataSource type="JdbcDataSource"
              name="bsds"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=bs_docmanager"
              user="username"
              password="pw"/>
  <datasource name="docds" type="BinURLDataSource" />

  <document name="docs">
    <entity name="doc" dataSource="bsds"
            query="select id,attachment,filename from attachment where ext='pdf' and id>30001030" >
            <field column="id" name="id" />
            <entity dataSource="docds" processor="TikaEntityProcessor"
                    url="${doc.attachment}" format="text" >
                <field column="attachment" name="bs_attachment" />
            </entity>
            <field column="filename" name="title" />
    </entity>
  </document>
</dataConfig>

3. solrconfig.xml and schema.xml are NOT changed.

However, when I access
http://localhost:8080/solr/dataimport?command=full-import

the import still fails with this error:
Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query:[B@ae1393 Processing Document # 1

Could you give me some advice? This problem has been very frustrating.
Thanks.

-- 
Best Regards,
Roy Liu


On Mon, Apr 11, 2011 at 5:16 AM, Lance Norskog <goksron@gmail.com> wrote:

> You have to upgrade completely to the Apache Solr 3.1 release. It is
> worth the effort. You cannot copy any jars between Solr releases.
> Also, you cannot copy over jars from newer Tika releases.
>
> On Fri, Apr 8, 2011 at 10:47 AM, Darx Oman <darxoman@gmail.com> wrote:
> > Hi again
> > what you are missing is field mapping
> > <field column="id" name="id" />
> > ....
> >
> >
> > no need for TikaEntityProcessor since you are not accessing PDF files
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
