lucene-java-user mailing list archives

From Erick Erickson <>
Subject Re: Any way to improve document fetching performance?
Date Tue, 28 Aug 2018 18:31:28 GMT
bq. It seems store field is not perform well.

Stored fields perform exactly as intended. Consider the situation
where very large text fields are stored. Making those into something
like docValues would be a very poor tradeoff, even if it were
possible. Not to mention highlighting etc. There are circumstances
where fetching from docValues actually has poorer overall performance
than using stored=true.
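
To make the trade-off concrete, here is a toy model in plain Java. It
is not Lucene's code. It only mimics the access pattern: stored fields
live row-wise in compressed blocks, so reading one document means
decompressing the block around it, while a docValues-style column is a
direct per-document lookup. The class name, the 128-docs-per-block
figure and the use of java.util.zip (instead of Lucene's own codec) are
all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy model only: contrasts "decompress a block per fetch" with a column lookup.
final class BlockVsColumn {
    static final int DOCS_PER_BLOCK = 128; // hypothetical block size, not Lucene's

    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && !inflater.finished() && inflater.needsInput()) {
                    throw new IllegalStateException("truncated block");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Row-oriented layout: newline-joined docs, compressed block by block.
    // Docs must not contain '\n' in this toy format.
    static byte[][] buildBlocks(String[] docs) {
        int nBlocks = (docs.length + DOCS_PER_BLOCK - 1) / DOCS_PER_BLOCK;
        byte[][] blocks = new byte[nBlocks][];
        for (int b = 0; b < nBlocks; b++) {
            StringBuilder sb = new StringBuilder();
            int end = Math.min(docs.length, (b + 1) * DOCS_PER_BLOCK);
            for (int i = b * DOCS_PER_BLOCK; i < end; i++) {
                sb.append(docs[i]).append('\n');
            }
            blocks[b] = compress(sb.toString().getBytes(StandardCharsets.UTF_8));
        }
        return blocks;
    }

    // Every fetch pays for decompressing the whole block that holds the doc.
    static String fetchFromBlocks(byte[][] blocks, int docId) {
        byte[] raw = decompress(blocks[docId / DOCS_PER_BLOCK]);
        String[] rows = new String(raw, StandardCharsets.UTF_8).split("\n", -1);
        return rows[docId % DOCS_PER_BLOCK];
    }

    public static void main(String[] args) {
        String[] column = new String[2000]; // column-style: one value per doc id
        for (int i = 0; i < column.length; i++) {
            column[i] = "short text field for doc " + i;
        }
        byte[][] blocks = buildBlocks(column);

        long t0 = System.nanoTime();
        for (int i = 0; i < column.length; i++) fetchFromBlocks(blocks, i);
        long t1 = System.nanoTime();
        int chars = 0;
        for (int i = 0; i < column.length; i++) chars += column[i].length();
        long t2 = System.nanoTime();

        System.out.println("block fetch:  " + (t1 - t0) / 1_000 + " us");
        System.out.println("column fetch: " + (t2 - t1) / 1_000 + " us (" + chars + " chars)");
    }
}
```

The timings it prints say nothing about a real index. The point is
only that the block path does work proportional to the block size on
every fetch, which is fine for 20 documents and adds up for 2,000. It
also shows why the trade-off flips for very large text fields:
compression is what keeps those affordable to store.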

That said, the ability to use docValues fields in place of stored
(subject to certain restrictions that you should take the time to
understand) does indeed blur the distinction.

It's really a matter of choosing the approach that supports the
use-case you require; there's no one-size-fits-all way to go about it.

Encoding/decoding in your own binary format? Will you be able to use
those values for things like faceting, grouping and sorting (which is
what docValues were designed to enhance)?
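
For reference, a custom binary encoding of several fields could look
roughly like the sketch below. I don't know the format actually used in
this thread. The class name and the count-plus-length-prefixed layout
are hypothetical, and it uses only the JDK. In Lucene the resulting
byte[] would presumably go into a BinaryDocValuesField.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical layout: [int fieldCount] then (name, value) pairs via writeUTF.
final class PackedFields {

    // Pack all returnable fields of one document into a single byte[].
    // writeUTF limits each string to 65535 encoded bytes.
    static byte[] encode(Map<String, String> fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reverse of encode(): rebuild the field map in its original order.
    static Map<String, String> decode(byte[] blob) {
        Map<String, String> fields = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String value = in.readUTF();
                fields.put(name, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("title", "a short text field");
        byte[] blob = encode(doc);
        System.out.println(blob.length + " bytes -> " + decode(blob));
    }
}
```

The catch is visible in the code: once the fields are packed this way,
the search engine sees one opaque blob per document. Nothing in it can
be faceted, grouped or sorted on without decoding it yourself.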

On Tue, Aug 28, 2018 at 2:11 AM alex stark <> wrote:
> I simply tried MultiDocValues.getBinaryValues to fetch results by doc
> value, it improves a lot, 2000 results take only 5 ms. I even encode
> all the returnable fields to binary docvalues and then decode them,
> the results are also good enough. It seems store field is not perform
> well....
>
> In our scenario (I think it is more common nowadays), search phrase
> should return as many results as possible so that rank phrase can
> resort the results by machine learning algorithm (on other clusters).
> Fetching performance is also important.
>
> ---- On Tue, 28 Aug 2018 00:11:40 +0800 Erick Erickson <> wrote ----
>
> Don't use that call. You're exactly right, it goes out to disk, reads
> the doc, decompresses it (16K blocks minimum per doc IIUC) all just to
> get the field. 2,000 in 50ms actually isn't bad for all that work ;).
>
> This sounds like an XY problem. You're asking how to speed up fetching
> docs, but not telling us anything about _why_ you want to do this.
> Fetching 2,000 docs is not generally what Solr was built for, it's
> built for returning the top N where N is usually < 100, most
> frequently < 20.
>
> If you want to return lots of documents' data you should seriously
> look at putting the fields you want in docValues=true fields and
> pulling from there. The entire Streaming functionality is built on
> this and is quite fast.
>
> Best,
> Erick
>
> On Mon, Aug 27, 2018 at 7:35 AM <> wrote:
>
> can you post your query string?
>
> Best
>
> On 8/27/18 10:33 AM, alex stark wrote:
>
> In same machine, no net latency. When I reduce to 500 limit, it takes
> 20ms, which is also slower than I expected. btw, indexing is stopped.
>
> ---- On Mon, 27 Aug 2018 22:17:41 +0800 <> wrote ----
>
> yes, it should be less than a ms actually for those type of files.
> index and search on the same machine? no net latency in between?
>
> Best
>
> On 8/27/18 10:14 AM, alex stark wrote:
>
> quite small, just several simple short text store fields. The total
> index size is around 1 GB (2m doc).
>
> ---- On Mon, 27 Aug 2018 22:12:07 +0800 <> wrote ----
>
> Alex,- how big are those docs?
>
> Best regards
>
> On 8/27/18 10:09 AM, alex stark wrote:
>
> Hello experts, I am wondering is there any way to improve document
> fetching performance, it appears to me that visiting from store field
> is quite slow. I simply tested to use indexsearch.doc() to get 2000
> document which takes 50ms. Is there any idea to improve that?

To unsubscribe, e-mail:
For additional commands, e-mail:
