trafficserver-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From a..@apache.org
Subject [trafficserver] branch master updated: IntrusiveHashMap: Inserts preserve order for equal keys, as with std::multimap. Update documentation, more testing.
Date Mon, 10 Sep 2018 16:24:31 GMT
This is an automated email from the ASF dual-hosted git repository.

amc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/trafficserver.git


The following commit(s) were added to refs/heads/master by this push:
     new 5a9fd0a  IntrusiveHashMap: Inserts preserve order for equal keys, as with std::multimap.
Update documentation, more testing.
5a9fd0a is described below

commit 5a9fd0a026101efd59dbdd95d3796787a0c13823
Author: Alan M. Carroll <amc@apache.org>
AuthorDate: Sat Sep 8 20:18:34 2018 -0500

    IntrusiveHashMap: Inserts preserve order for equal keys, as with std::multimap.
    Update documentation, more testing.
---
 .../internal-libraries/intrusive-hash-map.en.rst   | 28 +++++++--
 lib/ts/IntrusiveHashMap.h                          | 29 +++++----
 lib/ts/unit-tests/test_IntrusiveHashMap.cc         | 73 +++++++++++++++++++++-
 3 files changed, 113 insertions(+), 17 deletions(-)

diff --git a/doc/developer-guide/internal-libraries/intrusive-hash-map.en.rst b/doc/developer-guide/internal-libraries/intrusive-hash-map.en.rst
index 6ee1413..832f68e 100644
--- a/doc/developer-guide/internal-libraries/intrusive-hash-map.en.rst
+++ b/doc/developer-guide/internal-libraries/intrusive-hash-map.en.rst
@@ -20,7 +20,7 @@
 IntrusiveHashMap
 ****************
 
-:class:`IntrusiveHashMap` provides a "hash" or "unordered" set, using intrusive links. It
provides a
+:class:`IntrusiveHashMap` provides a "hashed" or "unordered" set, using intrusive links.
It provides a
 container for elements, each of which has a :arg:`key`. A hash function is applied to a key
to
 generate a :arg:`hash id` which is used to group the elements in to buckets for fast lookup.
This
 container is a mix of :code:`std::unordered_set` and :code:`std::unordered_map`. There is
no
@@ -29,7 +29,8 @@ separation between elements and keys, but each element can contain non-key
data.
 Iteration over elements is provided and is constant time.
 
 In order to optimize lookup, the container can increase the number of buckets used. This
is called
-the "expansion policy" of the container and it can be automatic or controlled externally.
+"expansion" and when it occurs is control by the "expansion policy" of the container. The
policy can
+be to be automatic or only done explicitly.
 
 Usage
 *****
@@ -96,9 +97,18 @@ Details
 
       An STL compliant iterator over elements in the container.
 
+   .. type:: range
+
+      An STL compliant half open range of elements, represented by a pair of iterators.
+
    .. function:: IntrusiveHashMap & insert(value_type * v)
 
-      Insert the element :arg:`v` into the container.
+      Insert the element :arg:`v` into the container. If there are already elements with
the same
+      key, :arg:`v` is inserted after these elements.
+
+      There is no :code:`emplace` because :arg:`v` is put in the container, not a copy of
:arg:`v`.
+      For the same reason :arg:`v` must be constructed before calling this method, the container
+      will never create an element instance.
 
    .. function:: iterator begin()
 
@@ -113,6 +123,13 @@ Details
       Search for :arg:`v` in the container. If found, return an iterator refering to :arg:`v`.
If not
       return the end iterator. This validates :arg:`v` is in the container.
 
+   .. function:: range equal_range(key_type key)
+
+      Find all elements with a key that is equal to :arg:`key`. The returned value is a half
open
+      range starting with the first matching element to one past the last matching element.
All
+      element in the range will have a key that is equal to :arg:`key`. If no element has
a matching
+      key the range will be empty.
+
    .. function:: IntrusiveHashMap & erase(iterator spot)
 
       Remove the element referred to by :arg:`spot` from the container.
@@ -149,10 +166,11 @@ Design Notes
 This is a refresh of an previously existing class, :code:`TSHahTable`. The switch to C++
11 and then
 C++ 17 made it possible to do much better in terms of the internal implementation and API.
The
 overall functionality is the roughly the same but with an easier API, compatiblity with
-:class:`IntrusiveDList`, and better internal implementation.
+:class:`IntrusiveDList`, improved compliance with STL container standards, and better internal
+implementation.
 
 The biggest change is that elements are stored in a single global list rather than per hash
bucket.
-The buckets server only as entry points in to the global list and to count the number of
elements
+The buckets serve only as entry points in to the global list and to count the number of elements
 per bucket. This simplifies the implementation of iteration, so that the old :code:`Location`
nested
 class can be removed. Elements with equal keys can be handled in the same way as with STL
 containers, via iterator ranges, instead of a custom psuedo-iterator class.
diff --git a/lib/ts/IntrusiveHashMap.h b/lib/ts/IntrusiveHashMap.h
index 3f1848a..14951ca 100644
--- a/lib/ts/IntrusiveHashMap.h
+++ b/lib/ts/IntrusiveHashMap.h
@@ -520,6 +520,7 @@ IntrusiveHashMap<H>::insert(value_type *v)
   auto key         = H::key_of(v);
   Bucket *bucket   = this->bucket_for(key);
   value_type *spot = bucket->_v;
+  bool mixed_p     = false; // Found a different key in the bucket.
 
   if (nullptr == spot) { // currently empty bucket, set it and add to active list.
     _list.append(v);
@@ -528,22 +529,28 @@ IntrusiveHashMap<H>::insert(value_type *v)
   } else {
     value_type *limit = bucket->limit();
 
+    // First search the bucket to see if the key is already in it.
     while (spot != limit && !H::equal(key, H::key_of(spot))) {
       spot = H::next_ptr(spot);
     }
-
-    if (spot == limit) { // this key is not in the bucket, add it at the end and note this
is now a mixed bucket.
-      _list.insert_before(bucket->_v, v);
-      bucket->_v       = v;
-      bucket->_mixed_p = true;
-    } else { // insert before the first matching key.
-      _list.insert_before(spot, v);
-      if (spot == bucket->_v) { // added before the bucket start, update the start.
-        bucket->_v = v;
-      } else { // if the matching key wasn't first, there is some other key in the bucket,
mark it mixed.
-        bucket->_mixed_p = true;
+    if (spot != bucket->_v) {
+      mixed_p = true; // found some other key, it's going to be mixed.
+    }
+    if (spot != limit) {
+      // If an equal key was found, walk past those to insert at the upper end of the range.
+      do {
+        spot = H::next_ptr(spot);
+      } while (spot != limit && H::equal(key, H::key_of(spot)));
+      if (spot != limit) { // something not equal past last equivalent, it's going to be
mixed.
+        mixed_p = true;
       }
     }
+
+    _list.insert_before(spot, v);
+    if (spot == bucket->_v) { // added before the bucket start, update the start.
+      bucket->_v = v;
+    }
+    bucket->_mixed_p = mixed_p;
   }
   ++bucket->_count;
 
diff --git a/lib/ts/unit-tests/test_IntrusiveHashMap.cc b/lib/ts/unit-tests/test_IntrusiveHashMap.cc
index abcbd35..050adb9 100644
--- a/lib/ts/unit-tests/test_IntrusiveHashMap.cc
+++ b/lib/ts/unit-tests/test_IntrusiveHashMap.cc
@@ -25,9 +25,9 @@
 #include <string_view>
 #include <string>
 #include <bitset>
+#include <random>
 #include <ts/IntrusiveHashMap.h>
 #include <ts/BufferWriter.h>
-#include <catch.hpp>
 #include "../../../tests/include/catch.hpp"
 
 // -------------
@@ -122,6 +122,7 @@ TEST_CASE("IntrusiveHashMap", "[libts][IntrusiveHashMap]")
   auto r = map.equal_range("dup"sv);
   REQUIRE(r.first != r.second);
   REQUIRE(r.first->_payload == "dup"sv);
+  REQUIRE(r.first->_n == 79);
 
   Map::iterator idx;
 
@@ -134,10 +135,13 @@ TEST_CASE("IntrusiveHashMap", "[libts][IntrusiveHashMap]")
   REQUIRE(r.first != r.second);
   idx = r.first;
   REQUIRE(idx->_payload == "dup"sv);
+  REQUIRE(idx->_n == 79);
   REQUIRE((++idx)->_payload == "dup"sv);
   REQUIRE(idx->_n != r.first->_n);
+  REQUIRE(idx->_n == 80);
   REQUIRE((++idx)->_payload == "dup"sv);
   REQUIRE(idx->_n != r.first->_n);
+  REQUIRE(idx->_n == 81);
   REQUIRE(++idx == map.end());
   // Verify only the "dup" items are left.
   for (auto &&elt : map) {
@@ -146,3 +150,70 @@ TEST_CASE("IntrusiveHashMap", "[libts][IntrusiveHashMap]")
   // clean up the last bits.
   map.apply([](Thing *thing) { delete thing; });
 };
+
+// Some more involved tests.
+TEST_CASE("IntrusiveHashMapManyStrings", "[IntrusiveHashMap]")
+{
+  std::vector<std::string_view> strings;
+
+  std::uniform_int_distribution<short> char_gen{'a', 'z'};
+  std::uniform_int_distribution<short> length_gen{20, 40};
+  std::minstd_rand randu;
+  constexpr int N = 1009;
+
+  Map ihm;
+
+  strings.reserve(N);
+  for (int i = 0; i < N; ++i) {
+    auto len = length_gen(randu);
+    char *s  = static_cast<char *>(malloc(len + 1));
+    for (decltype(len) j = 0; j < len; ++j) {
+      s[j] = char_gen(randu);
+    }
+    s[len] = 0;
+    strings.push_back({s, size_t(len)});
+  }
+
+  // Fill the IntrusiveHashMap
+  for (int i = 0; i < N; ++i) {
+    ihm.insert(new Thing{strings[i], i});
+  }
+
+  REQUIRE(ihm.count() == N);
+
+  // Do some lookups - just require the whole loop, don't artificially inflate the test count.
+  bool miss_p = false;
+  for (int j = 0, idx = 17; j < N; ++j, idx = (idx + 17) % N) {
+    if (auto spot = ihm.find(strings[idx]); spot == ihm.end() || spot->_n != idx) {
+      miss_p = true;
+    }
+  }
+  REQUIRE(miss_p == false);
+
+  // Let's try some duplicates when there's a lot of data in the map.
+  miss_p = false;
+  for (int idx = 23; idx < N; idx += 23) {
+    ihm.insert(new Thing(strings[idx], 2000 + idx));
+  }
+  for (int idx = 23; idx < N; idx += 23) {
+    auto spot = ihm.find(strings[idx]);
+    if (spot == ihm.end() || spot->_n != idx || ++spot == ihm.end() || spot->_n !=
2000 + idx) {
+      miss_p = true;
+    }
+  }
+  REQUIRE(miss_p == false);
+
+  // Do a different stepping, special cases the intersection with the previous stepping.
+  miss_p = false;
+  for (int idx = 31; idx < N; idx += 31) {
+    ihm.insert(new Thing(strings[idx], 3000 + idx));
+  }
+  for (int idx = 31; idx < N; idx += 31) {
+    auto spot = ihm.find(strings[idx]);
+    if (spot == ihm.end() || spot->_n != idx || ++spot == ihm.end() || (idx != (23 * 31)
&& spot->_n != 3000 + idx) ||
+        (idx == (23 * 31) && spot->_n != 2000 + idx)) {
+      miss_p = true;
+    }
+  }
+  REQUIRE(miss_p == false);
+};


Mime
View raw message