Skip to content

yijan4845/map

Repository files navigation

Size-Balanced Tree Map

A C++17 ordered associative container implemented with a Size-Balanced Tree (SBT) that is mostly compatible with std::map, but additionally supports:

  • $O(\log n)$ random access by index (get the k-th element)
  • $O(\log n)$ rank queries (get the index of an iterator)
  • Iterator arithmetic: it + k, it - k, it1 - it2, and order comparison

The core idea is to replace the usual red–black tree used in std::map with a Size-Balanced Tree that maintains the size of each subtree. This extra information allows us to support order-statistics operations without changing the asymptotic complexity of standard map operations.

The main implementation lives in (filenames may vary depending on your layout):

  • map.hProject::map<Key, T> interface (map-like container)
  • _SB_tree.h – Size-Balanced Tree implementation
  • tree.cpp – tree operations and internal helpers
  • unit_test.cpp / test_*.cpp – Google Test based unit tests

Motivation and Background

In C++, std::set and std::map are widely used ordered containers implemented as balanced binary search trees (typically red–black trees). They provide O(log n) search, insertion, and deletion, but do not offer random access iterators: you cannot efficiently

  • jump to the k-th element, or
  • compute the distance between two iterators.

Logically this makes sense – maps are not “indexed sequences” – but in practice they are often used as sorted lists with keys, and operations like “go to the k-th element” or “compute rank” become very desirable.

To explore this gap, this project implements a map based on a Size-Balanced Tree, a form of self-balancing BST that maintains the size of each subtree and uses rotations to keep the height O(log n). With subtree sizes available, we can:

  • find the k-th element by walking the tree and comparing k to left subtree sizes, and
  • compute the rank of an element by walking up to the root and accumulating sizes of left subtrees along the path.

Both operations run in O(log n) time.

The Size-Balanced Tree used here follows the structure proposed by Qifeng Chen (2007). It is similar to a weighted-balanced tree but uses slightly different balancing conditions. Height and operation complexities are proven to be O(log n).


Features

Map interface

The container is implemented as:

  • Project::map<Key, T, Compare, Alloc>

and aims to be mostly compatible with std::map:

  • begin(), end()
  • find(const key_type&)
  • insert(const value_type&)
  • erase(iterator)
  • erase(const key_type&)
  • erase(iterator first, iterator last)
  • clear()
  • size(), empty()
  • operator[](const key_type&)
  • count(const key_type&)
  • lower_bound(const key_type&)
  • upper_bound(const key_type&)

Extended iterator operations

Compared to std::map, iterators support additional operations:

  • iterator& operator+=(size_t k) – move forward by k elements
  • iterator& operator-=(size_t k) – move backward by k elements
  • iterator operator+(size_t k) const – return iterator k steps ahead
  • iterator operator-(size_t k) const – return iterator k steps behind
  • size_t operator-(const iterator& other) const – distance between two iterators
  • bool operator<(const iterator& other) const – compare relative order

These are implemented using subtree sizes in the underlying SBT and run in O(log n) time instead of repeatedly calling ++ / --.

Order-statistics operations

With subtree sizes maintained at each node, we can:

  • Random access by index: find the k-th element (0-based or 1-based depending on convention) in O(log n) time.
  • Rank queries: given an iterator, compute its rank (position in the sorted order) in O(log n) time by walking up the tree and summing left-subtree sizes.

Usage

A minimal example:

#include "map.h"
#include <iostream>

int main() {
    Project::map<int, std::string> m;

    m[1] = "one";
    m[3] = "three";
    m[2] = "two";

    // Standard map-like operations
    auto it = m.find(2);
    if (it != m.end()) {
        std::cout << "key = " << it->first
                  << ", value = " << it->second << "\n";
    }

    // Random access: move iterator by k positions
    auto kth = m.begin();
    kth += 2; // move 2 steps forward
    std::cout << "3rd element key = " << kth->first << "\n";

    // Distance between two iterators (in O(log n))
    auto first = m.begin();
    auto last  = m.end();
    --last;
    std::size_t dist = last - first;
    std::cout << "distance(first, last) = " << dist << "\n";

    return 0;
}

Building and running tests

The full project can be built with CMake and tested with Google Test.

Requirements:

  • C++17 compiler
  • CMake
  • Google Test (GTest)

Typical build & test flow:

mkdir build && cd build
cmake ..
make

# Run the main test binary
./main

unit_test.cpp, test_map.cpp, and test_tree.cpp (or similarly named files) contain test cases that compare this implementation against std::map to verify correctness.


Experimental Results

We evaluated the implementation on:

  • Luogu P6136 – “普通平衡树(加强版)”, a standard balanced BST benchmark problem.
  • Additional randomized tests using Google Test, cross-checking all operations against std::map.

Results:

  • The SBT-based map passes P6136 and behaves correctly on stress tests.
  • In our experiments, basic operations are typically around 1.2–2× slower than std::map:
    • each node stores an extra size field,
    • balancing conditions and rotations are slightly more complex.

In exchange, we obtain efficient O(log n) random access and rank queries, which are not available in the standard std::map implementation.


Implementation Notes (Design Highlights)

Some implementation details, inspired by stl_tree.h (GNU libstdc++) and CS106L-style container implementations:

  • The underlying SBT node (e.g. _Sb_tree_node_base) stores:
    • subtree size
    • parent, left, and right pointers
  • The header node tracks:
    • root, leftmost, and rightmost nodes
    • special invariants (e.g., header’s size = root size + 1) to simplify edge-case handling for begin() / end() / increment / decrement.
  • Allocation is done via a rebind of the user-provided Allocator to the internal node type.
  • Node creation uses forwarding constructors (template <typename... Args>) and handles exceptions by destroying and deallocating on failure.
  • Iterator types are provided in both non-const and const variants, with the appropriate conversions between them (carefully using const_cast where needed, similar to _Rb_tree in libstdc++).

This design follows the general spirit of the internal _Rb_tree used by std::map, but:

  • replaces the red–black balancing with Size-Balanced Tree rotations based on subtree sizes, and
  • adds order-statistics operations on top.

Limitations and Future Work

Compared to a red–black-tree-based std::map, this implementation has:

  • Higher time overhead

    • Most operations are slightly slower because subtree sizes must be updated on every insert/erase/rotation.
  • Higher space overhead

    • Each node stores an additional integer for subtree size.
  • Partial STL feature coverage

    • Some advanced functions such as emplace are not implemented.
    • Only unique keys are supported; inserting duplicate keys via insert is undefined behavior (no multimap semantics).
  • Testing could be expanded

    • Existing tests cover many scenarios, but more large-scale random tests and fuzzing would further increase confidence.

Potential future extensions:

  • emplace / try_emplace support
  • More STL-compatible utilities and range operations
  • Additional benchmarks and stress tests
  • Integration with C++20 ranges

Acknowledgements

This project is inspired by:

  • Qifeng Chen’s Size-Balanced Tree (2007), which maintains balance using subtree sizes and rotations.
  • The GNU libstdc++ rb_tree implementation (stl_tree.h) for overall container and iterator design.
  • Teaching materials such as Stanford CS106L for map/set implementation style and API design.

About

C++ Assignment

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published