lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-7306) Use radix sort for points too
Date Thu, 26 May 2016 17:02:12 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-7306:
---------------------------------
    Attachment: LUCENE-7903.patch

Here is a simple patch that uses radix sorting on the last dimension (which is convenient
since the bytes for the dimension and for the doc id are contiguous).
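
To illustrate why contiguous bytes help, here is a minimal sketch, not the code from the
attached patch. It assumes fixed-width records packed back to back in one byte[], each laid
out as [earlier dimensions][last dimension][doc id], so the last dimension and the doc id
can be sorted as a single unsigned byte string. The class and method names are hypothetical,
and it uses a plain LSD radix sort for brevity, which is not necessarily what the patch does.

```java
/** Hypothetical illustration only; not taken from the attached patch. */
class PackedRadixSortSketch {

  /**
   * Orders fixed-width records stored back to back in {@code packed} by the
   * {@code keyLen} bytes that start at {@code keyOffset} inside each record,
   * compared as an unsigned big-endian byte string.
   *
   * @return record ordinals in sorted order; {@code packed} itself is not modified
   */
  static int[] sortedOrds(byte[] packed, int recordLen, int keyOffset, int keyLen) {
    int count = packed.length / recordLen;
    int[] ords = new int[count];
    for (int i = 0; i < count; i++) {
      ords[i] = i;
    }
    int[] scratch = new int[count];

    // LSD radix sort: one stable counting-sort pass per key byte, last byte first.
    for (int k = keyLen - 1; k >= 0; k--) {
      // After the prefix sum, starts[b] is the number of records whose byte is < b.
      int[] starts = new int[257];
      for (int ord : ords) {
        starts[byteAt(packed, ord, recordLen, keyOffset + k) + 1]++;
      }
      for (int b = 0; b < 256; b++) {
        starts[b + 1] += starts[b];
      }
      // Scatter in the current order so that ties keep their relative position.
      for (int ord : ords) {
        int b = byteAt(packed, ord, recordLen, keyOffset + k);
        scratch[starts[b]++] = ord;
      }
      int[] tmp = ords;
      ords = scratch;
      scratch = tmp;
    }
    return ords;
  }

  /** Reads one byte of a record as an unsigned value in [0, 255]. */
  private static int byteAt(byte[] packed, int ord, int recordLen, int offset) {
    return packed[ord * recordLen + offset] & 0xFF;
  }
}
```

With 3-byte records of the form {dim0, dim1, docId}, calling sortedOrds(packed, 3, 1, 2)
sorts on dim1 and breaks ties by doc id in the same passes, because the two fields are
adjacent in the record.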

I used IndexAndSearchOpenStreetMaps to benchmark. The indexing time went from 344s to 327s
(-5%). Here are the first 30 log lines for merging points on master and with the patch:

Master
{code}
SM 0 [2016-05-26T16:28:35.224Z; Thread-0]: 2414 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:39.390Z; Thread-0]: 1899 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:43.443Z; Thread-0]: 1869 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:47.426Z; Thread-0]: 1812 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:51.444Z; Thread-0]: 1850 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:55.422Z; Thread-0]: 1819 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:28:59.409Z; Thread-0]: 1823 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:03.368Z; Thread-0]: 1817 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:07.296Z; Thread-0]: 1802 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:11.205Z; Thread-0]: 1793 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:34.980Z; Thread-0]: 23722 msec to merge points [10963000 docs]
SM 0 [2016-05-26T16:29:38.934Z; Thread-0]: 1798 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:42.844Z; Thread-0]: 1779 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:46.849Z; Thread-0]: 1797 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:50.866Z; Thread-0]: 1802 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:54.917Z; Thread-0]: 1820 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:29:58.965Z; Thread-0]: 1823 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:02.889Z; Thread-0]: 1783 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:06.815Z; Thread-0]: 1785 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:10.835Z; Thread-0]: 1876 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:14.759Z; Thread-0]: 1790 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:37.886Z; Thread-0]: 23085 msec to merge points [10963000 docs]
SM 0 [2016-05-26T16:30:41.777Z; Thread-0]: 1783 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:45.837Z; Thread-0]: 1783 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:49.731Z; Thread-0]: 1785 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:53.624Z; Thread-0]: 1776 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:30:57.536Z; Thread-0]: 1782 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:31:01.512Z; Thread-0]: 1787 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:31:05.477Z; Thread-0]: 1786 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:31:09.889Z; Thread-0]: 1770 msec to merge points [1096300 docs]
{code}

Patch
{code}
SM 0 [2016-05-26T16:20:21.241Z; Thread-0]: 2405 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:25.072Z; Thread-0]: 1583 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:28.834Z; Thread-0]: 1537 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:32.546Z; Thread-0]: 1489 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:36.426Z; Thread-0]: 1524 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:40.263Z; Thread-0]: 1519 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:44.123Z; Thread-0]: 1511 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:48.013Z; Thread-0]: 1506 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:51.807Z; Thread-0]: 1486 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:20:55.882Z; Thread-0]: 1479 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:17.042Z; Thread-0]: 21106 msec to merge points [10963000 docs]
SM 0 [2016-05-26T16:21:20.872Z; Thread-0]: 1517 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:24.629Z; Thread-0]: 1467 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:28.408Z; Thread-0]: 1479 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:32.219Z; Thread-0]: 1485 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:36.108Z; Thread-0]: 1501 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:39.982Z; Thread-0]: 1504 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:44.836Z; Thread-0]: 1502 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:48.717Z; Thread-0]: 1499 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:52.548Z; Thread-0]: 1503 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:21:56.436Z; Thread-0]: 1514 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:17.361Z; Thread-0]: 20883 msec to merge points [10963000 docs]
SM 0 [2016-05-26T16:22:21.197Z; Thread-0]: 1515 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:25.077Z; Thread-0]: 1513 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:28.864Z; Thread-0]: 1504 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:32.675Z; Thread-0]: 1494 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:36.503Z; Thread-0]: 1500 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:40.337Z; Thread-0]: 1516 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:44.281Z; Thread-0]: 1496 msec to merge points [1096300 docs]
SM 0 [2016-05-26T16:22:48.721Z; Thread-0]: 1503 msec to merge points [1096300 docs]
{code}
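
To compare the two runs without eyeballing the logs, the merge times can be averaged per
segment size. This is a rough sketch that assumes lines shaped exactly like the ones above
("N msec to merge points [M docs]"); the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for summarizing "msec to merge points" log lines. */
class MergePointsLogSummary {

  // Captures the elapsed milliseconds and the doc count of one merge log line.
  private static final Pattern LINE =
      Pattern.compile("(\\d+) msec to merge points \\[(\\d+) docs\\]");

  /** Returns the mean merge time in msec keyed by doc count; other lines are skipped. */
  static Map<Long, Double> meanMsecByDocCount(Iterable<String> lines) {
    // docCount -> {sum of msec, number of samples}
    Map<Long, long[]> acc = new TreeMap<>();
    for (String line : lines) {
      Matcher m = LINE.matcher(line);
      if (m.find()) {
        long msec = Long.parseLong(m.group(1));
        long docs = Long.parseLong(m.group(2));
        long[] a = acc.computeIfAbsent(docs, d -> new long[2]);
        a[0] += msec;
        a[1]++;
      }
    }
    Map<Long, Double> means = new TreeMap<>();
    for (Map.Entry<Long, long[]> e : acc.entrySet()) {
      means.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
    }
    return means;
  }
}
```

Grouping by doc count keeps the small merges (1096300 docs) separate from the large ones
(10963000 docs), which would otherwise dominate a single average.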

> Use radix sort for points too
> -----------------------------
>
>                 Key: LUCENE-7306
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7306
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7903.patch
>
>
> Like postings, points make heavy use of sorting at indexing time, so we should try to
> leverage radix sort too?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

