lucene-solr-user mailing list archives

From Mikhail Khludnev <>
Subject Re: Does DocValues improve Grouping performance ?
Date Sat, 31 Jan 2015 19:47:13 GMT

Please see my two questions inline below.

On Sat, Jan 31, 2015 at 10:14 PM, Michael Sokolov <> wrote:

> We were using grouping (no DocValues, though) and recently switched to
> using block-indexing and joins (see
> confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers).
> We got a nice speedup on average (perhaps 2x faster) and an even better
> improvement in the worst times; overall the performance is much more
> predictable and better, and I suspect (haven't checked) that we may be
> using less heap too.  The block indexing is cutting edge, a little
> complicated to get right, and I had to write some custom Java code to get
> things just the way I wanted, but for best performance it does seem to be
> the way to go.
> Beware some gotchas:
> You have to reindex all the docs that participate in the parent-child
> relation so that each parent-child block gets indexed at once.  This might
> cause difficulties, but for us and I suspect most people, it's the natural
> thing to do anyway.
> You can only handle a single relation this way since you have to
> restructure your index to use it; grouping is more flexible.
Would you mind commenting on which relations in particular you need to model?
BJQ is definitely much more restrictive than grouping, but it still has some
flexibility to cover the most frequent demands.
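
To make the block-indexing discussion concrete, here is a minimal sketch of
what a parent-child block and the two block join queries might look like. The
field names (doc_type, title, text) and the helper functions are hypothetical;
the `_childDocuments_` key assumes a Solr version whose JSON update handler
accepts nested documents, so check that against your release.

```python
import json


def build_block(parent_id, title, children):
    """Build one parent document with its children nested inside it,
    so the whole block is sent and indexed together."""
    return {
        "id": parent_id,
        "doc_type": "parent",  # hypothetical marker field for parents
        "title": title,
        "_childDocuments_": [
            {"id": f"{parent_id}-{i}", "doc_type": "child", "text": text}
            for i, text in enumerate(children)
        ],
    }


def to_parent_query(child_query, parent_filter="doc_type:parent"):
    """Match children, return their parents."""
    return f'{{!parent which="{parent_filter}"}}{child_query}'


def to_child_query(parent_query, parent_filter="doc_type:parent"):
    """Match parents, return their children."""
    return f'{{!child of="{parent_filter}"}}{parent_query}'


if __name__ == "__main__":
    block = build_block("book1", "Solr notes", ["first chapter", "second chapter"])
    print(json.dumps([block], indent=2))
    print(to_parent_query("text:chapter"))
    print(to_child_query("title:solr"))
```

The parent filter has to match all parent documents and nothing else, which
is why a dedicated marker field is assumed here.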

> Clients may not support the new block-indexing syntax (I think SolrJ has
> it, but the python client we were using did not);
> Converting an existing index requires special care; you basically have to
> delete all documents you are re-indexing
> The Solr query parsers don't support scoring the joined-from documents
> (child docs in the to-parent query, parent docs in the to-child query).
> This might not matter to you, but it was important for our use case
Would you mind leaving your vote? It's not a big deal to

> So there are some kinks still, but if you can make it work for you, it
> does seem to perform better than grouping.
> -Mike
> On 1/30/2015 4:10 PM, Cario, Elaine wrote:
>> Hi Shamik,
>> We use DocValues for grouping, and although I have nothing to compare it
>> to (we started with DocValues), we are also seeing poor results similar to
>> yours: easily 60% overhead compared to non-group queries.  Looking around for
>> some solution, no quick fix is presenting itself, unfortunately.
>> CollapsingQParserPlugin also is too limited for our needs.
>> -----Original Message-----
>> From: Shamik Bandopadhyay []
>> Sent: Thursday, January 15, 2015 6:02 PM
>> To:
>> Subject: Does DocValues improve Grouping performance ?
>> Hi,
>>     Does use of DocValues provide any performance improvement for
>> Grouping ?
>> I've looked into the blog which mentions improving Grouping performance
>> through DocValues.
>> Right now, Group by queries (which I sadly can't avoid) have become a huge
>> bottleneck. They have an overhead of 60-70% compared to the same query sans
>> group by. Unfortunately, I'm not able to use CollapsingQParserPlugin as it
>> doesn't support anything similar to the "group.facet" feature.
>> My understanding on DocValues is that it's intended for faceting and
>> sorting. Just wondering if anyone has tried DocValues for Grouping and seen
>> any improvements ?
>> -Thanks,
>> Shamik
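
For readers comparing the options named in the original question, the sketch
below builds request parameters for a grouped query with group.facet and for a
CollapsingQParserPlugin filter. The field names (category, brand) are
hypothetical and the helper functions are illustrative only. DocValues is
enabled per field in the schema (docValues="true"), not in the request, so it
does not appear here.

```python
from urllib.parse import urlencode


def grouping_params(query, group_field, facet_field=None):
    """Parameters for result grouping, optionally with group-aware facets."""
    params = {"q": query, "group": "true", "group.field": group_field}
    if facet_field is not None:
        params.update(
            {"facet": "true", "facet.field": facet_field, "group.facet": "true"}
        )
    return params


def collapse_params(query, group_field):
    """Parameters for collapsing on a field via a filter query."""
    return {"q": query, "fq": f"{{!collapse field={group_field}}}"}


if __name__ == "__main__":
    print(urlencode(grouping_params("text:solr", "category", "brand")))
    print(urlencode(collapse_params("text:solr", "category")))
```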

Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

