lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: HELP SolrJ: performance issue when adding index to SolrServer
Date Sun, 04 Jan 2015 19:02:57 GMT
On 1/4/2015 2:55 AM, zhangjianad@dcits.com wrote:
> 	SolrCloud 4.10.2:
> 		3 solr instances in tomcat 7,
> 		3 zookeepers to manage solr config set,
> 		solr index data is stored on HDFS
> 
> 	I have a web app that needs to build the solr index when a user adds a new
> post or comments on a post.
> 
> 	I use solrJ to add indexes to SolrServer; below are my sample code
> and the time consumed.
> 	How can I reduce the running time of the code below? Any tips to improve the
> performance? Any solution for this case? 200 ms is acceptable for us.
> 
> 
> 	//the code below takes about 500 ms
> 	CloudSolrServer server = new CloudSolrServer(zkHost);
> 	server.setDefaultCollection(defaultCollection);
> 
> 	//the code below takes about 500 ms
> 	server.connect();

Since creating the object and connecting to ZK should only be required
*once* during the lifetime of the application ... it doesn't really
matter how long this takes.  One second of time during application
initialization is *nothing*.  SolrServer objects are completely
thread-safe ... you do not need to create a new one every time your
application loops.
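
To illustrate the "create once, reuse everywhere" idea, here is a minimal
sketch using only the JDK.  SharedClient is a hypothetical helper name, not
part of SolrJ.  The assumption is that the factory you pass in would hold
the three calls from your code (new CloudSolrServer(zkHost),
setDefaultCollection, connect), so the startup cost is paid one time and
every request thread gets the same object.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper: builds an expensive client once and hands the
 * same instance to every caller. Safe to call from multiple threads.
 */
final class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                // Re-check under the lock so only one thread runs the factory.
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

In a web app you would keep one SharedClient in a static field or in your
dependency injection container, and call get() in the request handler
before server.add(doc).  A static final field initialized at startup would
work just as well if you do not need lazy creation.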

> 	//the code below takes about 1500 ms
> 	server.add(doc);

Unless that Solr document is *enormous*, there's something really wrong
if adding a single document takes a second and a half.

This is where we get into very specific questions about your Solr
installation, trying to track down where the performance problem is.  On
a single Solr server, how many total documents are in all the replicas
that live on that machine, how much disk space is taken by all the "index"
directories, how large is the Java heap, and how much total RAM does the
machine have?  Is there software other than Solr on the machine?  Have
you tuned your garbage collection on the JVM at all?

> 	//I use autoSoftCommit=1000ms,since commit in code will take about 30
> minutes
> 	//server.commit();

Do you actually *need* a document to be searchable within one second
after it is indexed?  Once the situation is examined deeply, this is
rarely a genuine requirement.  It is usually something that
sales/marketing lists as a requirement ... but in almost all real-world
situations, nobody ever notices if it takes a minute or two for new
stuff to become visible.

Unless the index is really tiny or you have taken special steps to
ensure it happens faster, a commit operation that opens a new searcher
will normally take a few seconds ... so you won't get that one second
visibility anyway.
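
If a longer visibility delay turns out to be acceptable, a more relaxed
commit policy in solrconfig.xml might look something like the fragment
below.  The element names are standard Solr update handler settings, but
the interval values are illustrative assumptions only, not numbers derived
from your installation.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to stable storage but does not open a new
       searcher, so it does not affect document visibility. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: controls when new documents become searchable.
       60000 ms here instead of the 1000 ms in the original setup. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```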

Just like I mentioned for an "add" taking 1.5 seconds, if a commit takes
30 minutes, there's something wrong.

Normally these kinds of performance issues happen when there is not
enough memory on the machine.  They can be caused by other things, but
that is the most common cause.

Thanks,
Shawn

