lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <shalinman...@gmail.com>
Subject Re: DIH: Setting rows= on full-import has no effect
Date Fri, 09 Oct 2009 09:57:48 GMT
FYI - This is fixed in trunk.
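
For anyone re-checking this on a trunk build, here is a minimal sketch that rebuilds the request URL from the command quoted below, so the `&` in the query string does not depend on shell quoting. The helper name `dih_url` is hypothetical and not part of Solr; the host, core and handler path are simply the ones from Jay's command.

```python
from urllib.parse import urlencode


def dih_url(base, command, **params):
    """Build a DataImportHandler request URL.

    Hypothetical helper for illustration only; it just assembles the
    query string and does not contact Solr.
    """
    query = {"command": command, **params}
    return f"{base}?{urlencode(query)}"


# Base URL taken from the quoted curl command; adjust for your own setup.
url = dih_url(
    "http://localhost:8983/solr/indexer/mediawiki",
    "full-import",
    rows=100,
)
print(url)
```

The printed URL can be passed to curl or any HTTP client as a single argument.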

2009/10/9 Noble Paul നോബിള്‍ नोब्ळ् <noble.paul@corp.aol.com>

> I have raised an issue http://issues.apache.org/jira/browse/SOLR-1501
>
> On Fri, Oct 9, 2009 at 6:10 AM, Jay Hill <jayallenhill@gmail.com> wrote:
> > In the past setting rows=n with the full-import command has stopped the DIH
> > importing at the number I passed in, but now this doesn't seem to be
> > working. Here is the command I'm using:
> >
> > curl 'http://localhost:8983/solr/indexer/mediawiki?command=full-import&rows=100'
> >
> > But when 100 docs are imported the process keeps running. Here's the log
> > output:
> >
> > Oct 8, 2009 5:23:32 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> > INFO: Indexing stopped at docCount = 100
> > Oct 8, 2009 5:23:33 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> > INFO: Indexing stopped at docCount = 200
> > Oct 8, 2009 5:23:35 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> > INFO: Indexing stopped at docCount = 300
> > Oct 8, 2009 5:23:36 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> > INFO: Indexing stopped at docCount = 400
> > Oct 8, 2009 5:23:38 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> > INFO: Indexing stopped at docCount = 500
> >
> > and so on.
> >
> > Running on the most recent nightly: 1.4-dev 823366M - jayhill - 2009-10-08 17:31:22
> >
> > I've used that exact url in the past and the indexing stopped at the rows
> > number as expected, but I haven't run the command for about two months on a
> > build from back in early July.
> >
> > Here's the dih config:
> >
> > <dataConfig>
> >   <dataSource
> >       name="dsFiles"
> >       type="FileDataSource"
> >       encoding="UTF-8"/>
> >   <document>
> >     <entity
> >         name="f"
> >         processor="FileListEntityProcessor"
> >         baseDir="/path/to/files"
> >         fileName=".*xml"
> >         recursive="true"
> >         rootEntity="false"
> >         dataSource="null">
> >
> >       <entity
> >           name="wikixml"
> >           processor="XPathEntityProcessor"
> >           forEach="/mediawiki/page"
> >           url="${f.fileAbsolutePath}"
> >           dataSource="dsFiles"
> >           onError="skip">
> >         <field column="id" xpath="/mediawiki/page/id"/>
> >         <field column="title" xpath="/mediawiki/page/title"/>
> >         <field column="contributor" xpath="/mediawiki/page/revision/contributor/username"/>
> >         <field column="comment" xpath="/mediawiki/page/revision/comment"/>
> >         <field column="text" xpath="/mediawiki/page/revision/text"/>
> >
> >       </entity>
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> >
> > -Jay
> >
>
>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
>



-- 
Regards,
Shalin Shekhar Mangar.
