carbondata-commits mailing list archives

From jack...@apache.org
Subject [carbondata] branch master updated: [CARBONDATA-3894] [IUD]decrease the size of tableupdatestaus file by remove the invalid segments not exist in tablestatus
Date Sun, 12 Jul 2020 17:07:44 GMT
This is an automated email from the ASF dual-hosted git repository.

jackylk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
     new 063d9b2  [CARBONDATA-3894] [IUD]decrease the size of tableupdatestaus file by remove the invalid segments not exist in tablestatus
063d9b2 is described below

commit 063d9b2aff86f66f22ce75bc6905affc8a4bd8df
Author: Zhangshunyu <zhangshunyu1990@126.com>
AuthorDate: Thu Jul 9 11:23:39 2020 +0800

    [CARBONDATA-3894] [IUD]decrease the size of tableupdatestaus file by remove the invalid segments not exist in tablestatus
    
    Why is this PR needed?
    The tableupdatestatus file always keeps a segment's info, even after the compacted segment
    has already been deleted. This makes the file grow quickly, which is bad for performance.
    After this change, the tableupdatestatus file size can decrease from ~MB to ~KB.
    
    What changes were proposed in this PR?
    Remove the invalid segments (those that no longer exist in tablestatus) from the
    tableupdatestatus file before it is written.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #3833
---
 .../apache/carbondata/core/mutate/CarbonUpdateUtil.java  | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
index e915c66..77ebf3e 100644
--- a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
@@ -148,7 +148,21 @@ public class CarbonUpdateUtil {
           mergeSegmentUpdate(isCompaction, oldList, newBlockEntry);
         }
 
-        segmentUpdateStatusManager.writeLoadDetailsIntoFile(oldList, updateStatusFileIdentifier);
+        List<SegmentUpdateDetails> updateDetailsValidSeg = new ArrayList<>();
+        Set<String> loadDetailsSet = new HashSet<>();
+        for (LoadMetadataDetails details : segmentUpdateStatusManager.getLoadMetadataDetails()) {
+          loadDetailsSet.add(details.getLoadName());
+        }
+        for (SegmentUpdateDetails updateDetails : oldList) {
+          if (loadDetailsSet.contains(updateDetails.getSegmentName())) {
+            // we should only keep the update info of segments in table status, especially after
+            // compaction and clean files some compacted segments will be removed. It can keep
+            // tableupdatestatus file in small size which is good for performance.
+            updateDetailsValidSeg.add(updateDetails);
+          }
+        }
+        segmentUpdateStatusManager
+            .writeLoadDetailsIntoFile(updateDetailsValidSeg, updateStatusFileIdentifier);
         status = true;
       } else {
         LOGGER.error("Not able to acquire the segment update lock.");
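
The filtering step in the hunk above can be tried in isolation with the following minimal sketch. It is not CarbonData code: `UpdateStatusFilterSketch`, `UpdateEntry`, and `keepValidSegments` are hypothetical names standing in for `SegmentUpdateDetails` and the load names read from the table status, and the sample data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UpdateStatusFilterSketch {

  /** Hypothetical stand-in for SegmentUpdateDetails; only the segment name matters here. */
  static final class UpdateEntry {
    final String segmentName;
    final String blockName;

    UpdateEntry(String segmentName, String blockName) {
      this.segmentName = segmentName;
      this.blockName = blockName;
    }
  }

  /** Keeps only the entries whose segment name still appears among the given load names. */
  static List<UpdateEntry> keepValidSegments(List<UpdateEntry> entries, List<String> loadNames) {
    // Build the set once so each membership check is O(1) on average.
    Set<String> validSegments = new HashSet<>(loadNames);
    List<UpdateEntry> kept = new ArrayList<>();
    for (UpdateEntry entry : entries) {
      if (validSegments.contains(entry.segmentName)) {
        kept.add(entry);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Made-up data: segment "1" was compacted and cleaned, so it is absent from the load names.
    List<UpdateEntry> entries = Arrays.asList(
        new UpdateEntry("0", "part-0-0"),
        new UpdateEntry("1", "part-0-1"),
        new UpdateEntry("2", "part-0-2"));
    List<String> loadNames = Arrays.asList("0", "2");

    for (UpdateEntry entry : keepValidSegments(entries, loadNames)) {
      System.out.println(entry.segmentName + " " + entry.blockName);
    }
  }
}
```

Collecting the load names into a `HashSet` first, as the commit does, keeps the pass linear in the number of update entries rather than scanning the load list once per entry.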

