lucene-solr-user mailing list archives

From "Ian Connor" <>
Subject fastest way to load documents
Date Fri, 01 Aug 2008 19:36:13 GMT
I have a number of documents in files:

1.xml   <add><doc><fields....></doc></add>
2.xml   <add><doc><fields....></doc></add>
17M.xml   <add><doc><fields....></doc></add>

I have been using cat to join them all together:

cat 1.xml 2.xml ... 1000.xml  | grep -v '<\/add><add>' > /tmp/post.xml

and posting them with curl:

curl -d @/tmp/post.xml 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml'
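
The join-and-post steps above could be sketched as two shell functions. This is only a minimal sketch, assuming each input file holds exactly one <add>...</add> wrapper with no attributes; the function names merge_adds and post_batch and the SOLR_URL variable are hypothetical and not part of Solr. It strips the inner wrappers with sed instead of filtering lines with grep -v, and it posts with --data-binary because curl's -d removes newlines from a file it reads.

```shell
#!/bin/sh
# Hypothetical default; the URL is the one used in the message above.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/update}"

# merge_adds FILE...
# Writes one <add> envelope holding every <doc> from the given files.
# Assumes each file is a single <add>...</add> wrapper without attributes.
merge_adds() {
  printf '<add>\n'
  sed -e 's/<add>//g' -e 's/<\/add>//g' "$@"
  printf '</add>\n'
}

# post_batch FILE
# Posts one merged batch; --data-binary sends the file unmodified.
post_batch() {
  curl --data-binary @"$1" -H 'Content-Type: text/xml' "$SOLR_URL"
}

# Example usage (not run here):
#   merge_adds 1.xml 2.xml > /tmp/post.xml
#   post_batch /tmp/post.xml
```

The sketch only keeps the merged file well-formed; it makes no claim about how fast the post itself runs.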

Is there a faster way to load these documents into a number of Solr
shards? I can process about 3000 documents/second just catting them
together (2500 at a time is the sweet spot for me), but this slows
down to under 100/second once I do the post with curl.


Ian Connor
