lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lucene/Solr QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8623) Decrease I/O pressure when merging high dimensional points
Date Tue, 08 Jan 2019 18:27:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737374#comment-16737374
] 

Lucene/Solr QA commented on LUCENE-8623:
----------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red}
The patch doesn't appear to include any new or modified tests. Please justify why no new tests
are needed for this patch. Also please list what manual steps were performed to verify this
patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} |
{color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 28s{color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 28s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}
 0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 25s{color} | {color:green}
core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 37s{color} | {color:black}
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8623 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12954126/LUCENE-8623.patch
|
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns
 |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57
UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
|
| git revision | master / a37e2c6 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
|  Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/149/testReport/ |
| modules | C: lucene/core U: lucene/core |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/149/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Decrease I/O pressure when merging high dimensional points
> ----------------------------------------------------------
>
>                 Key: LUCENE-8623
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8623
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Priority: Major
>         Attachments: Geo3D.png, LUCENE-8623.patch, LUCENE-8623.patch, LUCENE-8623.patch,
LUCENE-8623.patch, LatLonPoint.png, LatLonShape.png
>
>
> Related with LUCENE-8619, after indexing 60 million shapes(~1.65 billion triangles)
using {{LatLonShape}}, the index directory grew to a size of 265 GB when performing merging
of different segments. After the processes were over the index size was 57 GB.
> As an example imagine we are merging several segments to a new segment of size 10GB (4
dimensions). The BKD tree merging logic will create the following files:
> 1) Level 0: 4 copies of the data, each one sorted by one dimensions : 40GB
> 2) Level 1: 6 copies of half of the data, left and right : 30GB
> 3) Level 2: 6 copies of one quarter of the data, left and right : 15 GB
> 4) Level 3: 6 more copies halving the previous level, left and right : 7.5 GB
> 5) Level 4: 6 more copies halving the previous level, left and right : 3.75 GB
>  
> and so on... So it requires around 100GB to merge that segment. 
> In this issue is proposed to delay the creation of sorted copies to when they are needed.
It reduces the total size required to half of what it is needed now. 
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message