lucene-solr-user mailing list archives

From Shawn Heisey <>
Subject Re: DataImportHandler OutOfMemory Mysql
Date Sun, 02 Apr 2017 13:53:21 GMT
On 4/1/2017 4:17 PM, marotosg wrote:
> I am trying to load a big table into Solr using DataImportHandler and Mysql. 
> I am getting OutOfMemory error because Solr is trying to load the full
> table. I have been reading different posts and tried batchSize="-1". 
> Do you have any idea what could be the issue?
> Completely lost here.
> Solr.6.4.1
> mysql-connector-java-5.1.41-bin.jar
> data-config 
> <dataSource type="JdbcDataSource" 
>             driver="com.mysql.jdbc.Driver"
>             url="jdbc:mysql://" 
>             user="suer" 
>             password="passowrd"/>
> <document>
>   <entity name="jobsearch"  
>     pk="id"
> 	batchSize="-1"

Setting batchSize to -1 is the proper solution, but you've got it in the
wrong place.  It goes on dataSource, not on entity.
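
As a rough sketch of what that looks like, reusing the attribute values from
your quoted config (the truncated url is left as you posted it), with
batchSize removed from the entity and added to the dataSource:

```xml
<!-- batchSize="-1" belongs on the dataSource element, not on the entity -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://"
            user="suer"
            password="passowrd"
            batchSize="-1"/>
```

The entity keeps its other attributes unchanged; only its batchSize
attribute goes away.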

When batchSize is -1, DIH executes setFetchSize(Integer.MIN_VALUE) on
the JDBC statement.  This causes the MySQL JDBC driver to stream the
results instead of buffering them.

You should upgrade to 6.4.2 or 6.5.0.  6.4.0 and 6.4.1 have a serious
performance bug.

You may also want to edit the maxMergeCount setting in the mergeScheduler
config in solrconfig.xml and set it to at least 6.  I ran into a problem
with the database disconnecting while importing millions of rows with DIH
from MySQL; this was the solution.  See this thread:


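For reference, a minimal sketch of the mergeScheduler setting mentioned
above, as it would appear inside the indexConfig section of solrconfig.xml.
The maxThreadCount value of 1 is an assumption on my part, not something
required by the fix; it is included because the two settings are normally
given together, and maxThreadCount must not exceed maxMergeCount.

```xml
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <!-- allow up to 6 merges to be queued before indexing threads are stalled -->
    <int name="maxMergeCount">6</int>
    <!-- assumed value: number of merges allowed to run at the same time -->
    <int name="maxThreadCount">1</int>
  </mergeScheduler>
</indexConfig>
```

Solr needs a restart or a core reload for the change to take effect.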