lucene-java-user mailing list archives

From "De Simone, Alessandro" <Alessandro.DeSim...@bvdinfo.com>
Subject RE: search time & number of segments
Date Tue, 20 May 2014 13:04:24 GMT
Hi again!
 
> Using the calculator, I must admit that it is puzzling that you have
> 2432 / 143 = 17.001 times the amount of seeks with 16 segments.

Do you have any clue? Is there something I could test?
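
For instance, one thing I could test (a rough sketch; the index path, field name and term
below are placeholders, not the real ones): run the same query several times against each
index and check whether the gap disappears once the OS page cache is warm. If the timings
converge, the extra refills are mostly served from cache and the seek count alone would not
explain the slowdown.

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class WarmCacheTest {
  public static void main(String[] args) throws Exception {
    // args[0] = index directory; note that FSDirectory.open takes a java.io.File
    // on Lucene 4.x and a java.nio.file.Path on 5.x+.
    try (Directory dir = FSDirectory.open(Paths.get(args[0]));
         DirectoryReader reader = DirectoryReader.open(dir)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      TermQuery query = new TermQuery(new Term("body", "lucene")); // placeholder field/term
      for (int run = 0; run < 10; run++) {
        long start = System.nanoTime();
        searcher.search(query, 10);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("run " + run + ": " + ms + " ms");
      }
    }
  }
}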

> Is the total file size of the optimized index about the same as the segmented one?

Yes, it's about the same: currently 25.6 GB for the optimized index and 26.1 GB (17 segments) for
the un-optimized one.
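
A quick way to double-check both sizes and the segment count (a minimal sketch, assuming
Lucene 5.x-style FSDirectory.open(Path); on 4.x it takes a java.io.File instead):

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexStats {
  public static void main(String[] args) throws Exception {
    try (Directory dir = FSDirectory.open(Paths.get(args[0]));
         DirectoryReader reader = DirectoryReader.open(dir)) {
      long totalBytes = 0;
      for (String file : dir.listAll()) {
        totalBytes += dir.fileLength(file); // on-disk size of each index file
      }
      System.out.printf("segments: %d, total size: %.1f GB%n",
          reader.leaves().size(), totalBytes / (1024.0 * 1024 * 1024));
    }
  }
}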

> While I find the number of seeks to be an interesting problem, I wonder why you don't
> just solve the performance problem by throwing hardware at it?

I don't have the budget to change the hardware, and it would be difficult for me to justify
replacing working hardware just to handle the same amount of data :-(
Anyway, I would certainly have noticed the performance hit sooner or later, even with an SSD.

Thanks

Alessandro De Simone


-----Original Message-----
From: Toke Eskildsen [mailto:te@statsbiblioteket.dk] 
Sent: Monday, 19 May 2014 16:43
To: java-user@lucene.apache.org
Subject: Re: search time & number of segments

On Mon, 2014-05-19 at 11:54 +0200, De Simone, Alessandro wrote:

[24GB index, 8GB disk cache, only indexed fields]

> The "IO calls" I was referring to are the number of times the
> "BufferedIndexInput.refill()" function is called. So it means that we
> have 16 times more bytes read when there are 16 segments, for the exact
> same result.

Using the calculator, I must admit that it is puzzling that you have
2432 / 143 = 17.001 times the amount of seeks with 16 segments. I would have expected that
number to be smaller than 16, due to the pure chance of data being in the same blocks in some
segments.
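
One way to drill into this (a hypothetical sketch on my side, not necessarily how your counts
were gathered): rebuild lucene-core with a small counter hook in BufferedIndexInput.refill()
and reset/print it around each test query. The RefillCounter class and the per-file breakdown
below are sketches, not Lucene APIs; broken down per file, the counts could show where the
extra factor of ~17 comes from.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical instrumentation helper; not part of Lucene. Requires a local
// rebuild of lucene-core with the one-line hook shown in the comment below.
public final class RefillCounter {
  // refill() calls per index file, so a 17x factor can be traced to specific
  // files (.tim, .doc, .pos, ...) and therefore to specific segments.
  public static final Map<String, LongAdder> PER_FILE = new ConcurrentHashMap<>();

  public static void record(String resourceDescription) {
    PER_FILE.computeIfAbsent(resourceDescription, k -> new LongAdder()).increment();
  }

  public static void dumpAndReset(String label) {
    System.out.println("refills for " + label + ":");
    PER_FILE.forEach((file, count) -> System.out.println("  " + file + " = " + count));
    PER_FILE.clear();
  }
}

// One-line hook at the top of org.apache.lucene.store.BufferedIndexInput#refill():
//   RefillCounter.record(toString()); // toString() normally contains the file name
//
// Around each test query:
//   searcher.search(query, 10);
//   RefillCounter.dumpAndReset("16-segment index");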

Is the total file size of the optimized index about the same as the segmented one?

Toke:
> > I am guessing that you are using spinning drives and that there is
> > not much RAM in the machine?
> 
> As you can see we have a lot of RAM.

Not if you're using spinning drives and have no stored fields.
http://wiki.apache.org/solr/SolrPerformanceProblems

While I find the number of seeks to be an interesting problem, I wonder why you don't just
solve the performance problem by throwing hardware at it? Consumer SSDs are dirt cheap nowadays
and even the enterprise ones are not that pricey. The same goes for RAM, as long as we're talking
about a relatively small amount such as 32GB.

- Toke Eskildsen, State and University Library, Denmark



