A C++17 ordered associative container implemented with a
Size-Balanced Tree (SBT) that is mostly compatible with std::map,
but additionally supports:
-
$O(\log n)$ random access by index (get the k-th element) -
$O(\log n)$ rank queries (get the index of an iterator) - Iterator arithmetic:
it + k,it - k,it1 - it2, and order comparison
The core idea is to replace the usual red–black tree used in std::map
with a Size-Balanced Tree that maintains the size of each subtree.
This extra information allows us to support order-statistics operations
without changing the asymptotic complexity of standard map operations.
The main implementation lives in (filenames may vary depending on your layout):
map.h–Project::map<Key, T>interface (map-like container)_SB_tree.h– Size-Balanced Tree implementationtree.cpp– tree operations and internal helpersunit_test.cpp/test_*.cpp– Google Test based unit tests
In C++, std::set and std::map are widely used ordered containers
implemented as balanced binary search trees (typically red–black trees).
They provide O(log n) search, insertion, and deletion, but do not
offer random access iterators: you cannot efficiently
- jump to the k-th element, or
- compute the distance between two iterators.
Logically this makes sense – maps are not “indexed sequences” – but in practice they are often used as sorted lists with keys, and operations like “go to the k-th element” or “compute rank” become very desirable.
To explore this gap, this project implements a map based on a
Size-Balanced Tree, a form of self-balancing BST that maintains the
size of each subtree and uses rotations to keep the height
O(log n). With subtree sizes available, we can:
- find the k-th element by walking the tree and comparing
kto left subtree sizes, and - compute the rank of an element by walking up to the root and accumulating sizes of left subtrees along the path.
Both operations run in O(log n) time.
The Size-Balanced Tree used here follows the structure proposed by
Qifeng Chen (2007). It is similar to a weighted-balanced tree but uses
slightly different balancing conditions. Height and operation
complexities are proven to be O(log n).
The container is implemented as:
Project::map<Key, T, Compare, Alloc>
and aims to be mostly compatible with std::map:
begin(),end()find(const key_type&)insert(const value_type&)erase(iterator)erase(const key_type&)erase(iterator first, iterator last)clear()size(),empty()operator[](const key_type&)count(const key_type&)lower_bound(const key_type&)upper_bound(const key_type&)
Compared to std::map, iterators support additional operations:
iterator& operator+=(size_t k)– move forward bykelementsiterator& operator-=(size_t k)– move backward bykelementsiterator operator+(size_t k) const– return iteratorksteps aheaditerator operator-(size_t k) const– return iteratorksteps behindsize_t operator-(const iterator& other) const– distance between two iteratorsbool operator<(const iterator& other) const– compare relative order
These are implemented using subtree sizes in the underlying SBT and run
in O(log n) time instead of repeatedly calling ++ / --.
With subtree sizes maintained at each node, we can:
- Random access by index: find the k-th element (0-based or 1-based
depending on convention) in
O(log n)time. - Rank queries: given an iterator, compute its rank (position in the
sorted order) in
O(log n)time by walking up the tree and summing left-subtree sizes.
A minimal example:
#include "map.h"
#include <iostream>
int main() {
Project::map<int, std::string> m;
m[1] = "one";
m[3] = "three";
m[2] = "two";
// Standard map-like operations
auto it = m.find(2);
if (it != m.end()) {
std::cout << "key = " << it->first
<< ", value = " << it->second << "\n";
}
// Random access: move iterator by k positions
auto kth = m.begin();
kth += 2; // move 2 steps forward
std::cout << "3rd element key = " << kth->first << "\n";
// Distance between two iterators (in O(log n))
auto first = m.begin();
auto last = m.end();
--last;
std::size_t dist = last - first;
std::cout << "distance(first, last) = " << dist << "\n";
return 0;
}The full project can be built with CMake and tested with Google Test.
Requirements:
- C++17 compiler
- CMake
- Google Test (GTest)
Typical build & test flow:
mkdir build && cd build
cmake ..
make
# Run the main test binary
./mainunit_test.cpp, test_map.cpp, and test_tree.cpp (or similarly named
files) contain test cases that compare this implementation against
std::map to verify correctness.
We evaluated the implementation on:
- Luogu P6136 – “普通平衡树(加强版)”, a standard balanced BST benchmark problem.
- Additional randomized tests using Google Test, cross-checking all
operations against
std::map.
Results:
- The SBT-based map passes P6136 and behaves correctly on stress tests.
- In our experiments, basic operations are typically around
1.2–2× slower than
std::map:- each node stores an extra
sizefield, - balancing conditions and rotations are slightly more complex.
- each node stores an extra
In exchange, we obtain efficient O(log n) random access and rank
queries, which are not available in the standard std::map
implementation.
Some implementation details, inspired by stl_tree.h (GNU libstdc++)
and CS106L-style container implementations:
- The underlying SBT node (e.g.
_Sb_tree_node_base) stores:- subtree size
- parent, left, and right pointers
- The header node tracks:
- root, leftmost, and rightmost nodes
- special invariants (e.g., header’s size = root size + 1) to simplify
edge-case handling for
begin()/end()/ increment / decrement.
- Allocation is done via a rebind of the user-provided
Allocatorto the internal node type. - Node creation uses forwarding constructors (
template <typename... Args>) and handles exceptions by destroying and deallocating on failure. - Iterator types are provided in both non-const and const variants, with
the appropriate conversions between them (carefully using
const_castwhere needed, similar to_Rb_treein libstdc++).
This design follows the general spirit of the internal _Rb_tree used
by std::map, but:
- replaces the red–black balancing with Size-Balanced Tree rotations based on subtree sizes, and
- adds order-statistics operations on top.
Compared to a red–black-tree-based std::map, this implementation has:
-
Higher time overhead
- Most operations are slightly slower because subtree sizes must be updated on every insert/erase/rotation.
-
Higher space overhead
- Each node stores an additional integer for subtree size.
-
Partial STL feature coverage
- Some advanced functions such as
emplaceare not implemented. - Only unique keys are supported; inserting duplicate keys via
insertis undefined behavior (nomultimapsemantics).
- Some advanced functions such as
-
Testing could be expanded
- Existing tests cover many scenarios, but more large-scale random tests and fuzzing would further increase confidence.
Potential future extensions:
emplace/try_emplacesupport- More STL-compatible utilities and range operations
- Additional benchmarks and stress tests
- Integration with C++20 ranges
This project is inspired by:
- Qifeng Chen’s Size-Balanced Tree (2007), which maintains balance using subtree sizes and rotations.
- The GNU libstdc++
rb_treeimplementation (stl_tree.h) for overall container and iterator design. - Teaching materials such as Stanford CS106L for map/set implementation style and API design.