lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8675) Divide Segment Search Amongst Multiple Threads
Date Thu, 31 Jan 2019 18:03:00 GMT


Adrien Grand commented on LUCENE-8675:

The best way to address such issues is on top of Lucene by having multiple shards whose results
can be merged with TopDocs#merge.
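
As a rough illustration of the shard-and-merge approach, here is a minimal sketch in plain Java. It does not use Lucene: the class ShardMergeSketch, the Hit type, and the float[] "shards" (where scores[doc] > 0 means the doc matches) are all hypothetical stand-ins. In a real setup each shard would be its own index searched with its own IndexSearcher, and the final step would be a call to TopDocs.merge(topN, shardHits) rather than the hand-written merge shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for searching several shards in parallel and merging
// their top hits. None of these types are Lucene classes.
class ShardMergeSketch {

    /** One hit: the shard it came from, its shard-local doc ID, and its score. */
    static final class Hit {
        final int shard;
        final int doc;
        final float score;

        Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Orders hits by descending score; ties are broken by shard, then by doc ID. */
    static int compareHits(Hit a, Hit b) {
        int c = Float.compare(b.score, a.score);
        if (c != 0) return c;
        c = Integer.compare(a.shard, b.shard);
        return c != 0 ? c : Integer.compare(a.doc, b.doc);
    }

    /** Stand-in for a per-shard search: scores[doc] > 0 means the doc matches. */
    static List<Hit> searchShard(int shard, float[] scores, int topN) {
        List<Hit> hits = new ArrayList<>();
        for (int doc = 0; doc < scores.length; doc++) {
            if (scores[doc] > 0f) hits.add(new Hit(shard, doc, scores[doc]));
        }
        hits.sort(ShardMergeSketch::compareHits);
        return new ArrayList<>(hits.subList(0, Math.min(topN, hits.size())));
    }

    /** Merges the per-shard top hits into a single global top-N list. */
    static List<Hit> mergeTopN(int topN, List<List<Hit>> shardHits) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : shardHits) all.addAll(hits);
        all.sort(ShardMergeSketch::compareHits);
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }

    /** Searches every shard on its own thread, then merges the results. */
    static List<Hit> searchAllShards(float[][] shards, int topN) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.length));
        try {
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (int s = 0; s < shards.length; s++) {
                final int shard = s;
                futures.add(pool.submit(() -> searchShard(shard, shards[shard], topN)));
            }
            List<List<Hit>> perShard = new ArrayList<>();
            for (Future<List<Hit>> f : futures) perShard.add(f.get());
            return mergeTopN(topN, perShard);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each shard only needs to return its own top N hits, since no hit outside a shard's top N can appear in the global top N. That keeps the merge step cheap relative to the per-shard searches.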

Parallelizing based on ranges of doc IDs is problematic for some queries. For instance, the
cost of evaluating a range query over an entire segment or over only a specific range of
doc IDs is exactly the same, given that it uses data structures that are organized by value
rather than by doc ID.
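
To make the point concrete, here is a minimal sketch in plain Java. It is a deliberately simplified model, not Lucene's actual point structure: the class ValueOrderedRangeSketch, its sorted-array layout, and the visited counter are all assumptions made for illustration. Entries are stored in value order, so a range query has to walk every entry in the value range and can only filter by doc ID afterwards.

```java
import java.util.Arrays;

// Hypothetical model of a value-ordered structure: (value, docId) pairs kept
// sorted by value. Not a Lucene class.
class ValueOrderedRangeSketch {
    final long[] values; // values in ascending order
    final int[] docs;    // docs[i] is the doc ID that holds values[i]
    int visited;         // number of entries examined by the last count() call

    ValueOrderedRangeSketch(long[] valuesByDoc) {
        int n = valuesByDoc.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort doc IDs by the value each doc holds.
        Arrays.sort(order, (a, b) -> Long.compare(valuesByDoc[a], valuesByDoc[b]));
        values = new long[n];
        docs = new int[n];
        for (int i = 0; i < n; i++) {
            docs[i] = order[i];
            values[i] = valuesByDoc[order[i]];
        }
    }

    /** Returns the first index whose value is >= minValue. */
    private int lowerBound(long minValue) {
        int lo = 0, hi = values.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (values[mid] < minValue) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /**
     * Counts docs in [minDoc, maxDocExclusive) whose value lies in
     * [minValue, maxValue]. The doc ID bounds only filter the result; they do
     * not reduce how many entries are visited.
     */
    int count(long minValue, long maxValue, int minDoc, int maxDocExclusive) {
        visited = 0;
        int matches = 0;
        for (int i = lowerBound(minValue); i < values.length && values[i] <= maxValue; i++) {
            visited++;
            if (docs[i] >= minDoc && docs[i] < maxDocExclusive) matches++;
        }
        return matches;
    }
}
```

In this model, running the same value range over the whole doc ID space or over one half of it examines the same number of entries, so two threads that each take half of the doc IDs would each repeat the full traversal instead of splitting it.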

> Divide Segment Search Amongst Multiple Threads
> ----------------------------------------------
>                 Key: LUCENE-8675
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Atri Sharma
>            Priority: Major
> Segment search is a single-threaded operation today, which can be a bottleneck for large
> analytical workloads that index a lot of data and run complex queries touching multiple
> segments (imagine a composite query with a range query and filters on top). This ticket is for
> discussing the idea of splitting the search of a single segment across multiple threads based
> on mutually exclusive document ID ranges.
> This will be a two-phase effort, with the first phase targeting queries that return all matching
> documents (collectors that do not terminate early). The second-phase patch will introduce staged
> execution and will build on top of the first-phase patch.

This message was sent by Atlassian JIRA
