lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: questions regarding stored fields role in query time
Date Tue, 26 Feb 2019 17:09:53 GMT
On 2/26/2019 1:34 AM, Saurabh Sharma wrote:
> Now we want to do partial updates. I went through the documentation and
> found that all fields must be stored or have docValues for partial
> updates. I have a few questions about this:
> 
> 1) If I fetch only one field in a query, what is the performance impact
> of all fields being stored? Say I have an "id" field with docValues
> enabled: will Solr use stored fields in this case? Will it load the
> whole document into RAM?

I am not aware of any option to keep docValues in RAM.  If you have 
enough memory in your system (memory that has NOT been assigned to any 
program), then the OS *might* keep some or all of your index data in 
memory.  That functionality, present in all modern operating systems, is 
the secret to good performance.

The stored data is compressed in blocks.  The docValues data is not 
compressed that way; it uses lighter-weight encodings that are cheap to 
decode.  Uncompressing stored data uses CPU cycles.  Generally, if data 
must be read off of disk, compressed will be faster because fewer bytes 
have to be read.  But if the data has been cached by the OS and comes 
from memory, which you definitely want to happen if possible, 
uncompressed will likely be faster ... and it will definitely require 
less CPU.

If you have many fields but you're only fetching one, then docValues 
will almost certainly be faster than stored.  All of the stored fields 
for one document are compressed together, so Solr will be reading data 
that it won't actually be using, in order to achieve decompression.
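
To make that concrete, here is a rough toy model in Python.  It is not 
how Lucene actually lays out its files: zlib and JSON stand in for the 
real stored-field format, and the field names are made up.  It only 
shows why a row-oriented compressed blob forces you to decompress 
everything to get at one field, while a column-oriented layout does not.

```python
import json
import zlib

# Toy model only: zlib and JSON are stand-ins for Lucene's real
# stored-field format. Field names and sizes are invented.
doc = {f"field_{i}": f"value {i} " * 20 for i in range(50)}
doc["id"] = "doc-1"

# Row-oriented, like stored fields: every field of the document is
# compressed together in one blob.
stored_blob = zlib.compress(json.dumps(doc).encode("utf-8"))

# Column-oriented, like docValues: each field is kept separately, so one
# field can be read without touching the others.
doc_values = {name: value.encode("utf-8") for name, value in doc.items()}


def read_id_from_stored(blob):
    """Decompress the whole document just to get one field."""
    raw = zlib.decompress(blob)
    return json.loads(raw)["id"], len(raw)


def read_id_from_doc_values(columns):
    """Read only the bytes that belong to the requested field."""
    raw = columns["id"]
    return raw.decode("utf-8"), len(raw)


stored_id, stored_bytes = read_id_from_stored(stored_blob)
dv_id, dv_bytes = read_id_from_doc_values(doc_values)
print("stored path decoded", stored_bytes, "bytes to get", stored_id)
print("docValues path decoded", dv_bytes, "bytes to get", dv_id)
```

Both paths return the same id, but the stored path has to decode the 
entire document to find it.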

I believe that if you have both stored data and docValues for a field, 
Solr will use stored data for search results.  I am not positive that 
this is the case, but I think it's what happens.

> 2) What is the impact of large stored fields (.fdt) on query-time
> performance? Does query time depend on the stored fields at all, or
> only on the indexes?

The size of your stored data will have no *DIRECT* impact on query 
performance.  Stored data is not consulted for the query part.  It is 
consulted when document data is retrieved to return with the response.
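
One rough way to see the two phases separately is to send the same 
query twice, once with rows=0 so no documents are returned, and once 
asking for documents.  The sketch below only builds the two URLs; the 
host, port, collection name, and query are placeholders I made up, so 
adjust them for your setup.

```python
from urllib.parse import urlencode

# Hypothetical base URL; change host, port, and collection name.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def select_url(query, rows, fields=None):
    """Build a /select URL using the standard q, rows, wt, and fl params."""
    params = {"q": query, "rows": rows, "wt": "json"}
    if fields:
        params["fl"] = ",".join(fields)
    return SOLR_SELECT + "?" + urlencode(params)


# Matching only: no documents are returned, so no field data is fetched.
match_only = select_url("title:solr", rows=0)

# Matching plus retrieval of one field for the top 10 documents.
with_docs = select_url("title:solr", rows=10, fields=["id"])

print(match_only)
print(with_docs)
```

Timing both requests from the client gives a rough idea of how much of 
the total is spent retrieving document data rather than matching.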

A large amount of stored data can have an indirect impact on query 
performance.  If there is insufficient memory available to the OS disk 
cache, then reading the stored data to return results to the client will 
push information out of the disk cache that is needed for queries.  If 
that happens, then Solr will need to re-read that data off the disk to 
do a query.  Because disks are glacially slow compared to memory, 
performance will be impacted.
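
Here is a small simulation of that effect.  It is only a sketch: the OS 
disk cache is modeled as a simple LRU cache, which real operating 
systems only approximate, and the page counts are invented.  Each 
"query" touches the same index pages and then reads a fresh set of 
stored-data pages to build the response.

```python
from collections import OrderedDict


class PageCache:
    """Tiny LRU cache standing in for the OS disk cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def read(self, page):
        """Return True on a cache hit, False if the page came from disk."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        self.pages[page] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return False


def index_disk_reads(capacity, index_pages=8, stored_per_query=6, queries=20):
    """Count how often index pages must be re-read from disk."""
    cache = PageCache(capacity)
    misses = 0
    next_stored = 0
    for _ in range(queries):
        # Query phase: the same index pages are needed every time.
        for i in range(index_pages):
            if not cache.read(("index", i)):
                misses += 1
        # Response phase: different stored-data pages for each result set.
        for _ in range(stored_per_query):
            cache.read(("stored", next_stored))
            next_stored += 1
    return misses


small_cache = index_disk_reads(capacity=10)
large_cache = index_disk_reads(capacity=200)
print("index pages read from disk, small cache:", small_cache)
print("index pages read from disk, large cache:", large_cache)
```

With the large cache, the index pages are read from disk once and then 
stay cached.  With the small cache, the stored-data reads keep evicting 
them, so the index pages are read from disk again on later queries.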

Here's a page about performance problems.  Most of it is about memory, 
since that is usually the resource that has the biggest effect on 
performance:

https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
