gora-dev mailing list archives

From "Alfonso Nishikawa (JIRA)" <j...@apache.org>
Subject [jira] [Created] (GORA-391) Arrays persisted in HBase don't shrink automatically
Date Mon, 27 Oct 2014 20:39:34 GMT
Alfonso Nishikawa created GORA-391:

             Summary: Arrays persisted in HBase don't shrink automatically
                 Key: GORA-391
                 URL: https://issues.apache.org/jira/browse/GORA-391
             Project: Apache Gora
          Issue Type: Bug
          Components: gora-hbase
    Affects Versions: 0.5, 0.4
            Reporter: Alfonso Nishikawa
            Assignee: Alfonso Nishikawa
            Priority: Minor

Fields defined as arrays can grow and be updated, but don't shrink when an element is deleted.

See the code involved: [https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L312]

The workaround is:
# Define the field as a nullable array: ['null', ...array...]
# Set the field to null and persist  -> the array will be deleted
# Set the field to the new array and persist -> the array will be persisted with the new
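
The workaround steps above can be sketched with a small in-memory model. This is only an illustration and does not use the Gora or HBase APIs: `ToyArrayStore`, `persist`, `load` and `persistShrunk` are hypothetical names, and the model assumes that persisting `null` removes every column of the field while persisting an array only writes the columns for its own indexes.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for one nullable array field of a stored record. Not the Gora API. */
class ToyArrayStore {
    // Column qualifier (the element's index) -> persisted element.
    private final Map<Integer, String> columns = new TreeMap<>();

    /**
     * Assumed behaviour: a null field deletes all of its columns;
     * a non-null array only writes indexes 0..n-1 and removes nothing.
     */
    void persist(String[] field) {
        if (field == null) {
            columns.clear();
            return;
        }
        for (int i = 0; i < field.length; i++) {
            columns.put(i, field[i]);
        }
    }

    /** Reads the array back in qualifier order. */
    String[] load() {
        return columns.values().toArray(new String[0]);
    }

    /** The workaround: persist null first (step 2), then persist the new array (step 3). */
    void persistShrunk(String[] newArray) {
        persist(null);
        persist(newArray);
    }
}
```

In this model, persisting a shorter array directly leaves the old trailing column in place, while `persistShrunk` reads back exactly the new array.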

Comment from Renato:
bq.You are right, the array cannot be shrunk at the moment, and yes, it is wrong to have
to write the whole array back if you just want to change a single element. The column qualifier
used for each item is its original index, which means that if your original array had 10 elements,
then you'd have 10 column qualifiers to store those 10 items. But if you then delete the third
element, Gora will end up with 9 actual elements (without the third), but there will still be a
10th element inside HBase :( and when modifying a specific element, we will end up rewriting
all of the elements :( Maybe we should do the same thing we do with the maps and rewrite
them all into HBase. At least it will work correctly.
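
To make the 10-element scenario in the comment concrete, here is a minimal sketch that models a row as a sorted map from column qualifier (the array index) to value. It is a toy model, not HBaseStore code: `QualifierPerIndex`, `newRow`, `putByIndex`, `staleCells` and `rewriteAll` are hypothetical names introduced for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of "one column qualifier per array index". Not HBaseStore code. */
class QualifierPerIndex {

    /** Creates an empty row: qualifier (array index) -> cell value. */
    static NavigableMap<Integer, String> newRow() {
        return new TreeMap<>();
    }

    /** Writes array[i] under qualifier i. Existing cells are overwritten, never deleted. */
    static void putByIndex(NavigableMap<Integer, String> row, String... array) {
        for (int i = 0; i < array.length; i++) {
            row.put(i, array[i]);
        }
    }

    /** Counts cells whose qualifier is at or beyond the array's current length. */
    static int staleCells(NavigableMap<Integer, String> row, int arrayLength) {
        return row.tailMap(arrayLength, true).size();
    }

    /** Map-style strategy from the comment: delete every cell, then write them all. */
    static void rewriteAll(NavigableMap<Integer, String> row, String... array) {
        row.clear();
        putByIndex(row, array);
    }
}
```

Writing 10 elements with `putByIndex` and then the same array without its third element leaves 10 cells in the row, so `staleCells(row, 9)` reports one leftover cell under qualifier 9. After `rewriteAll` with the 9-element array, no stale cell remains.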

Maybe the best solution would be adaptive persistence: if a large percentage of the field
has to be persisted, overwrite everything. If only a small percentage of the field has to be persisted, update
it as a diff (additions, deletions, updates). This proposed approach seems too complex,
so the solution to implement is the one found in maps: delete all elements and write them
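
For reference, the adaptive idea that was set aside could look roughly like the sketch below. It is not part of Gora: `AdaptivePersistence`, `changedCells`, `choose` and the `threshold` parameter are hypothetical, and it assumes that "percentage of the field persisted" means the fraction of index positions that would need a put or a delete.

```java
import java.util.Objects;

/** Hypothetical sketch of the adaptive strategy; not part of Gora. */
class AdaptivePersistence {

    enum Strategy { OVERWRITE_ALL, DIFF }

    /** Counts index positions that would need a put (update, addition) or a delete. */
    static int changedCells(String[] stored, String[] updated) {
        int common = Math.min(stored.length, updated.length);
        // Positions present on only one side are additions or deletions.
        int changed = Math.abs(stored.length - updated.length);
        for (int i = 0; i < common; i++) {
            if (!Objects.equals(stored[i], updated[i])) {
                changed++; // same index, different value: an update
            }
        }
        return changed;
    }

    /** Overwrites everything once the changed fraction reaches the (assumed) threshold. */
    static Strategy choose(String[] stored, String[] updated, double threshold) {
        int total = Math.max(stored.length, updated.length);
        if (total == 0) {
            return Strategy.DIFF; // both arrays are empty, nothing to write
        }
        double fraction = (double) changedCells(stored, updated) / total;
        return fraction >= threshold ? Strategy.OVERWRITE_ALL : Strategy.DIFF;
    }
}
```

Note that with index-based qualifiers, deleting an element near the start shifts every later element, so most positions count as changed and the sketch picks `OVERWRITE_ALL`. Updating a single element in place changes one position and picks `DIFF`.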

{panel:bgColor=#FFFFCE} (!) This will be horrible for arrays with big elements when only one
element is updated, but it is the same as what is being done now. The same applies to maps. {panel}

This message was sent by Atlassian JIRA
