lucene-solr-user mailing list archives

From "Dyer, James" <James.D...@ingramcontent.com>
Subject RE: [SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files
Date Wed, 07 Nov 2012 18:43:31 GMT
Try specifying the "escape" parameter.  This is the character your file uses to escape delimiters
occurring in the data.  If this fixes your problem and if you do not want to have to specify
"escape", you can alter the patch, line 113:

change: (escape == '\\')
to: (escape !=null && escape == '\\')

If this works for you, please upload the modified patch to the JIRA issue for future reference.
 Thanks.
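
To illustrate both points above, here is a minimal Java sketch. It is not code from the SOLR-2549 patch: the class name EscapeAwareSplitter and the method split are hypothetical, and the assumption is that the patch holds "escape" as a nullable Character. In Java, comparing a null Character to a char with == auto-unboxes it and throws a NullPointerException, which is why the null check has to come first. Whether that is the exact cause of the reported NPE is not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeAwareSplitter {

    // Hypothetical helper, not taken from the SOLR-2549 patch.
    // 'escape' is null when no escape character has been configured.
    static List<String> split(String line, char separator, Character escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            // Null check first: "c == escape" alone would auto-unbox a null
            // Character and throw a NullPointerException.
            if (escape != null && c == escape && i + 1 < line.length()) {
                // Keep the character after the escape literally,
                // even if it is the separator.
                current.append(line.charAt(++i));
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Escaped comma stays inside the first field.
        System.out.println(split("a\\,b,c", ',', '\\'));
        // No escape configured: plain split, no exception.
        System.out.println(split("a,b,c", ',', null));
    }
}
```

With an escape character configured, a delimiter preceded by that character stays inside the field. With escape left as null, the line is split on every delimiter and no exception is thrown.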

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: zakaria benzidalmal [mailto:zakibenz@gmail.com] 
Sent: Wednesday, November 07, 2012 9:53 AM
To: solr-user@lucene.apache.org
Subject: Re: [SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files

James,

Thank you for your quick answer.
This is my data-config.xml file:

<dataConfig>
    <dataSource name="dfs" type="FileDataSource"/>
    <document>
        <entity name="sourcefile"
                processor="FileListEntityProcessor"
                fileName="rocinter.csv"
                rootEntity="false"
                baseDir="/user/work/solr/example/example-DIH/solr/csv/in"
        >

            <entity name="entryline"
                    processor="LineEntityProcessor"
                    url="${sourcefile.fileAbsolutePath}"
                    rootEntity="true"
                    dataSource="fds"
                    separator=","
            >
            </entity>
        </entity>
    </document>
</dataConfig>

Am I doing something wrong?

Regards.
______________________
Zakaria BENZIDALMAL
mobile: 06 31 40 04 33


2012/11/7 Dyer, James <James.Dyer@ingramcontent.com>

> Zakaria,
>
> You might want to post your data-config.xml, or at least the part that
> uses SOLR-2549.  If it's throwing an NPE, it certainly has a bug (if you're
> doing something wrong, it would at least give you a sensible error
> message).  Also, unless you need to use DIH for some other reason, you
> might want to consider the csv request handler to do your imports, which is
> a mature feature of Solr for importing whole documents from delimited (not
> just csv) files.  See http://wiki.apache.org/solr/UpdateCSV
>
> Here is an example that loads a fixed-width file using DIH and SOLR-2549
> (actually it uses code that SOLR-2549 was based on.  I haven't tried this
> with the exact code in SOLR-2549):
>
> <dataConfig>
>         <dataSource name="URL" baseUrl="${dataimporter.request.fileBasepath}" type="URLDataSource" />
>         <document name="FixedWidthCounts">
>                 <entity
>                         name="Counts"
>                         processor="org.apache.solr.handler.dataimport.LineEntityProcessor"
>                         dataSource="URL"
>                         url="incoming/COUNTS.txt"
>                         colDef1="ID,0,9,BIGDECIMAL,0,LEFT"
>                         colDef2="COUNT,9,19,INTEGER,0,LEFT"
>                 />
>         </document>
> </dataConfig>
>
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: zakaria benzidalmal [mailto:zakibenz@gmail.com]
> Sent: Wednesday, November 07, 2012 9:08 AM
> To: solr-user@lucene.apache.org
> Subject: [SOLR-2549] DIH LineEntityProcessor support for delimited &
> fixed-width files
>
> Hi all,
>
> Could someone provide a clear example using this processor
> (a data-config.xml example)?
>
> I run into this problem after patching and building my code:
>
> GRAVE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
> Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
>         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
>         ... 3 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:542)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
>         ... 5 more
> Caused by: java.lang.NullPointerException
>         at org.apache.solr.handler.dataimport.LineEntityProcessor.initDelimitedOrFixedWidth(LineEntityProcessor.java:142)
>         at org.apache.solr.handler.dataimport.LineEntityProcessor.init(LineEntityProcessor.java:115)
>         at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:74)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:430)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
>         ... 6 more
>
> Regards.
>
> zakibenz.
>
>

