lucene-solr-user mailing list archives

From "Smiley, David W." <dsmi...@mitre.org>
Subject RE: concurrent csv loading
Date Thu, 06 Aug 2009 21:41:39 GMT
You should benefit from concurrent loading.  The text analysis, at least, would be done
concurrently; I'm not sure what else benefits, but I think there are other things.
Ideally you would make the number of concurrent loads configurable, try a few values, and
pick the one that gets the job done fastest.
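
A minimal sketch of that experiment, in Python, under several assumptions that are not from this thread: Solr is reachable at http://localhost:8983/solr, the first line of workfile.txt is a header with the field names, every record sits on exactly one line, and Solr can read the chunk files at the absolute paths passed in stream.file. The helper names (split_with_header, csv_url, load_concurrently) are made up for illustration. The request parameters are the ones from the original post.

```python
import time
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed base URL; adjust to the actual Solr instance.
SOLR_URL = "http://localhost:8983/solr"

# Same parameters as the single post in the question, minus stream.file.
CSV_PARAMS = {
    "commit": "false",
    "separator": "\t",
    "escape": "\\",
    "map": "NULL:",
    "keepEmpty": "false",
}


def split_with_header(lines, n):
    """Deal the data rows round-robin into at most n chunks.

    Assumes lines[0] is the CSV header; it is repeated at the top of
    every chunk so each file can be loaded on its own.
    """
    header, rows = lines[0], lines[1:]
    chunks = [[header] for _ in range(n)]
    for i, row in enumerate(rows):
        chunks[i % n].append(row)
    # Drop chunks that received no data rows.
    return [c for c in chunks if len(c) > 1]


def csv_url(stream_file):
    """Build the /update/csv URL for one chunk file."""
    params = dict(CSV_PARAMS)
    params["stream.file"] = str(stream_file)
    return SOLR_URL + "/update/csv?" + urllib.parse.urlencode(params)


def post(url):
    """Issue the request and return the HTTP status code."""
    with urllib.request.urlopen(url) as resp:
        return resp.status


def load_concurrently(workfile, n, post_fn=post):
    """Split workfile into up to n parts and load them in parallel.

    Returns (elapsed_seconds, list_of_statuses). Splitting time is not
    included in the measurement.
    """
    src = Path(workfile)
    with open(src, encoding="utf-8", newline="") as f:
        lines = f.readlines()
    parts = []
    for k, chunk in enumerate(split_with_header(lines, n)):
        part = src.with_name(f"{src.stem}.part{k}{src.suffix}")
        with open(part, "w", encoding="utf-8", newline="") as out:
            out.writelines(chunk)
        parts.append(part.resolve())
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        statuses = list(pool.map(post_fn, [csv_url(p) for p in parts]))
    return time.perf_counter() - start, statuses
```

Calling load_concurrently("workfile.txt", n) against an empty index for n = 1, 2, 4 and 8 and comparing the elapsed times would show which level of concurrency is fastest on a given box. Since the posts use commit=false, a commit is still needed after the loads finish.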

~ David Smiley
________________________________________
From: Joe Calderon [calderon.joe@gmail.com]
Sent: Thursday, August 06, 2009 4:58 PM
To: solr-user@lucene.apache.org
Subject: concurrent csv loading

for first-time loads i currently post to
/update/csv?commit=false&separator=%09&escape=\&stream.file=workfile.txt&map=NULL:&keepEmpty=false
which works well and finishes in about 20 minutes for my workload.

this is mostly cpu bound: i have an 8 core box and it seems one core
takes the brunt of the work.

if i wanted to optimize, would i see any benefit to splitting
workfile.txt in two and doing two posts?

i'm running Lucid's build of Solr 1.3.0 on Jetty 6. io is not a
bottleneck, as the data folder is on tmpfs

thx much
--joe