lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Slower indexing speed in Solr 8.0.0
Date Wed, 03 Apr 2019 04:40:53 GMT
I'm using an external ZooKeeper and running SolrCloud with one shard and two
replicas. This is a test setup, so there is only one machine.
The input data comes from CSV files. I index one CSV file at a time, and
each CSV file contains 3 million records.
I'm indexing using the code from SimplePostTool.

I have already tried it more than 10 times, and every time the indexing in
8.0.0 was at least 40% slower than in 7.7.1.

Regards,
Edwin




On Wed, 3 Apr 2019 at 11:19, Aroop Ganguly <aroopganguly@icloud.com> wrote:

> Indexing speeds are a function of a lot of variables in my experience.
>
> What is your setup like?
> What kind of cluster do you have, how many shards did you create, how
> many machines, etc.?
> Where is your input data coming from? What technology do you use for
> indexing (simple Java threads or something more robust like Flink/Spark)?
> How many documents do you index at a time?
>
> How many times have you run the indexer job on the new 8.0 setup before
> concluding it's slower?
> Make a matrix of all these variables and test over at least 5 runs before
> forming an opinion.
>
> I’d love to hear more.
>
> > On Apr 2, 2019, at 7:41 PM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> > wrote:
> >
> > For additional info, I am still using the same version of the major
> > components like ZooKeeper, Tika, Carrot2 and Jetty.
> >
> > Regards,
> > Edwin
> >
> > On Wed, 3 Apr 2019 at 10:17, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I am setting up the latest Solr 8.0.0, and I am re-indexing the data
> >> from scratch in Solr 8.0.0.
> >>
> >> However, I found that the indexing speed is slower in Solr 8.0.0
> >> compared to an earlier version, Solr 7.7.1. I have not changed the
> >> schema.xml and solrconfig.xml yet; I only changed the
> >> luceneMatchVersion in solrconfig.xml to 8.0.0:
> >> <luceneMatchVersion>8.0.0</luceneMatchVersion>
> >>
> >> On average, the speed is about 40% to 50% slower. For example,
> >> indexing took about 17 mins in Solr 7.7.1, but now it takes about
> >> 25 mins to index the same set of data.
> >>
> >> What could be the reason that causes the indexing to be slower in Solr
> >> 8.0.0?
> >>
> >> Regards,
> >> Edwin
> >>
>
>
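
For reference, the slowdown figure in the thread can be checked with a few lines of Python. This is only a sketch: the function names slowdown_percent and summarize are made up for illustration, and the only numbers used are the timings from the original post (about 17 minutes on Solr 7.7.1, about 25 minutes on 8.0.0). The summarize helper is there for the "at least 5 runs" suggestion and expects run times you have measured yourself.

```python
from statistics import mean, stdev


def slowdown_percent(baseline_minutes, candidate_minutes):
    """Extra wall-clock time of the candidate relative to the baseline, in percent."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return (candidate_minutes - baseline_minutes) / baseline_minutes * 100.0


def summarize(run_minutes):
    """Return (mean, sample standard deviation) for a list of measured run times."""
    spread = stdev(run_minutes) if len(run_minutes) > 1 else 0.0
    return mean(run_minutes), spread


# Timings quoted in the original post: ~17 min on 7.7.1, ~25 min on 8.0.0.
print(round(slowdown_percent(17, 25), 1))  # prints 47.1
```

With those two timings the increase is about 47%, which is consistent with the "40% to 50% slower" range reported. Comparing the mean and spread from summarize across several runs per version would show whether the gap is larger than the run-to-run noise.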
