commons-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (COLLECTIONS-714) PatriciaTrie ignores trailing null characters in keys
Date Mon, 04 Nov 2019 06:38:00 GMT

     [ https://issues.apache.org/jira/browse/COLLECTIONS-714?focusedWorklogId=337987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-337987 ]

ASF GitHub Bot logged work on COLLECTIONS-714:
----------------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Nov/19 06:37
            Start Date: 04/Nov/19 06:37
    Worklog Time Spent: 10m 
      Work Description: dota17 commented on pull request #99: [COLLECTIONS-714] Throw exception when put \u0000 to trie
URL: https://github.com/apache/commons-collections/pull/99#discussion_r341914871
 
 

 ##########
 File path: src/main/java/org/apache/commons/collections4/trie/analyzer/StringKeyAnalyzer.java
 ##########
 @@ -82,6 +82,9 @@ public int bitIndex(final String key, final int offsetInBits, final int lengthIn
                 k = 0;
             } else {
                 k = key.charAt(index1);
+                if (k == 0) {
+                    throw new IllegalArgumentException("Don't support '\\u0000' in the key.");
+                }
 
 Review comment:
   In the exception message, '\\u0000' is displayed as the text \u0000, whereas '\u0000' would be displayed as a blank space.
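
The point of this review comment can be checked with a small sketch. The class and method names below are hypothetical and not part of the pull request; only the message text is taken from the diff. The first method uses the doubled backslash from the patch, so the compiled string holds the six visible characters of the escape. The second builds the same message around the actual NUL character (written as a cast here instead of a Unicode escape), which most consoles render as nothing or as a blank.

```java
public class NulEscapeDemo {
    // Message as written in the patch: the doubled backslash yields a literal
    // backslash followed by 'u', '0', '0', '0', '0' in the compiled string.
    static String escapedMessage() {
        return "Don't support '\\u0000' in the key.";
    }

    // Same message, but with the actual NUL character (code point 0)
    // between the single quotes.
    static String rawMessage() {
        return "Don't support '" + (char) 0 + "' in the key.";
    }

    public static void main(String[] args) {
        // The escape sequence is readable in the first message.
        System.out.println(escapedMessage());
        // The NUL character is usually invisible in the second one.
        System.out.println(rawMessage());
        // Six visible characters versus one invisible character.
        System.out.println(escapedMessage().length() - rawMessage().length());
    }
}
```

The escaped form is the better choice for an exception message, since a reader of a log can see which character was rejected.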
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 337987)
    Time Spent: 20m  (was: 10m)

> PatriciaTrie ignores trailing null characters in keys
> -----------------------------------------------------
>
>                 Key: COLLECTIONS-714
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-714
>             Project: Commons Collections
>          Issue Type: Bug
>          Components: Collection, Map
>    Affects Versions: 4.3
>            Reporter: Rohan Padhye
>            Priority: Critical
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In Java, strings are not null terminated. The string "x" (of length = 1 char) is different from the string "x\u0000" (of length = 2 chars). However, PatriciaTrie does not seem to distinguish between these strings.
> To reproduce: 
> {code:java}
> public void testNullTerminatedKey1() {
>     Map<String, Integer> map = new HashMap<>();
>     map.put("x", 0);         // key of length 1
>     map.put("x\u0000", 1);   // key of length 2
>     map.put("x\u0000y", 2);  // key of length 3
>     Assert.assertEquals(3, map.size());  // ok, 3 distinct keys
>     PatriciaTrie<Integer> trie = new PatriciaTrie<>(map);
>     Assert.assertEquals(3, trie.size());  // fail; actual=2
> }
> {code}
> In the above example, the resulting trie has only two keys: "x\u0000" and "x\u0000y". The key "x" gets overwritten. Here is another way to repro the bug: 
> {code:java}
> public void testNullTerminatedKey2() {
>     PatriciaTrie<Integer> trie = new PatriciaTrie<>();
>     trie.put("x", 0);
>     Assert.assertTrue(trie.containsKey("x")); // ok
>     trie.put("x\u0000", 1);
>     Assert.assertTrue(trie.containsKey("x")); // fail
> }
> {code}
> In the above example, the key "x" suddenly disappears when an entry with key "x\u0000" is inserted.
> The PatriciaTrie docs do not mention anything about null-terminated strings. In general, I believe this also breaks the JDK Map contract, since "x".equals("x\u0000") is false. 
> This bug was found automatically using JQF: [https://github.com/rohanpadhye/jqf].
>  
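
One plausible explanation for the behavior reported above, consistent with the `k = 0` branch visible in the diff, is that a bitwise key comparison treats positions past the end of a key as zero bits. The sketch below illustrates that idea only. `ZeroPaddingSketch`, `bitAt` and `firstDifferingBit` are hypothetical names, and the padding rule is an assumption made for illustration, not the actual StringKeyAnalyzer code.

```java
public class ZeroPaddingSketch {
    // Hypothetical helper: returns bit number bitIndex of the key (16 bits per
    // char, most significant bit first). Positions past the end of the key are
    // treated as 0; this padding rule is an assumption for illustration.
    static boolean bitAt(String key, int bitIndex) {
        int charIndex = bitIndex / Character.SIZE;
        if (charIndex >= key.length()) {
            return false; // past the end: padded with zero bits
        }
        int mask = 0x8000 >>> (bitIndex % Character.SIZE);
        return (key.charAt(charIndex) & mask) != 0;
    }

    // Returns the first bit position at which the two keys differ, or -1 if
    // they are bit-for-bit identical once the shorter key is zero-padded.
    static int firstDifferingBit(String a, String b) {
        int bits = Math.max(a.length(), b.length()) * Character.SIZE;
        for (int i = 0; i < bits; i++) {
            if (bitAt(a, i) != bitAt(b, i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String nul = String.valueOf((char) 0);
        // No differing bit is found: the trailing NUL looks like padding.
        System.out.println(firstDifferingBit("x", "x" + nul));
        // A non-NUL second character is detected as a difference.
        System.out.println(firstDifferingBit("x", "xy"));
        // String equality still tells the two keys apart.
        System.out.println("x".equals("x" + nul));
    }
}
```

Under this assumption a trailing NUL character carries no information, so "x" and "x\u0000" map to the same trie position even though `equals` reports them as different keys.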



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
