lucene-solr-user mailing list archives

From "Anatharaman, Srinatha (Contractor)" <Srinatha_Ananthara...@comcast.com>
Subject DataImportHandler - Unable to load Tika Config Processing Document # 1
Date Mon, 06 Feb 2017 22:45:19 GMT
Hi,

I am getting the error below while trying to index using the DataImportHandler.

The data-config file is included below. In ZooKeeper mode, Solr is unable to load "tikaConfig.xml",
which is referenced in the following statement:

  processor="TikaEntityProcessor" tikaConfig="tikaConfig.xml"

Please help me resolve this issue.

ion: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to load Tika Config Processing Document # 1
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to load Tika Config Processing Document # 1
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
        ... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to load Tika Config Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.firstInit(TikaEntityProcessor.java:96)
        at org.apache.solr.handler.dataimport.EntityProcessorBase.init(EntityProcessorBase.java:60)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.init(TikaEntityProcessor.java:76)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:75)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:433)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        ... 5 more
Caused by: org.apache.solr.common.cloud.ZooKeeperException: ZkSolrResourceLoader does not support getConfigDir() - likely, what you are trying to do is not supported in ZooKeeper mode
        at org.apache.solr.cloud.ZkSolrResourceLoader.getConfigDir(ZkSolrResourceLoader.java:149)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.firstInit(TikaEntityProcessor.java:91)
        ... 11 more


<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
    <dataSource name="bin" type="BinFileDataSource" />
    <document>
        <entity name="f" dataSource="fileSource" rootEntity="false"
                processor="FileListEntityProcessor"
                baseDir="/app/home/source/"
                fileName=".*\.(com)|(txt)|(docx)"
                onError="skip"
                recursive="true">
            <field column="fileAbsolutePath" name="path" />
            <field column="fileSize" name="size" />
            <field column="fileLastModified" name="lastModified" />
            <field column="link" name="link"/>

            <entity name="documentImport" dataSource="bin"
                    processor="TikaEntityProcessor" tikaConfig="tikaConfig.xml"
                    url="${f.fileAbsolutePath}"
                    format="text">
                <field column="file" name="fileName"/>
                <field column="content" name="content"/>
                <field column="Author" name="author" meta="true"/>
                <field column="title" name="title" meta="true"/>
                <field column="text" name="text"/>
            </entity>
        </entity>
    </document>
</dataConfig>
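
One possible workaround, offered as an untested sketch rather than a confirmed fix: the innermost cause in the trace shows getConfigDir() being called from TikaEntityProcessor.firstInit, which suggests the relative tikaConfig value is being resolved against the collection's config directory. Assuming an absolute path skips that lookup (an assumption, not something the trace confirms), the inner entity could point at a copy of the file on the local disk of every Solr node. The path /app/home/conf/tikaConfig.xml below is hypothetical.

```xml
<!-- Sketch only: same inner entity as above, with tikaConfig given as an
     absolute local path. The path is hypothetical, and the file would have
     to exist at that location on every Solr node. -->
<entity name="documentImport" dataSource="bin"
        processor="TikaEntityProcessor"
        tikaConfig="/app/home/conf/tikaConfig.xml"
        url="${f.fileAbsolutePath}"
        format="text">
    <field column="file" name="fileName"/>
    <field column="content" name="content"/>
    <field column="Author" name="author" meta="true"/>
    <field column="title" name="title" meta="true"/>
    <field column="text" name="text"/>
</entity>
```

If no custom Tika settings are needed, dropping the tikaConfig attribute altogether may also be worth trying, since the processor would then presumably fall back to Tika's default configuration.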

