
\chapter{Bit String Data Structures}\label{c:bit-strings}

In this chapter we present succinct data structures and succinct indices which are related to bit strings.
We start with definitions of the \rank{} and \select{} operations on a general bit string and show the succinct indices for them.

Then we consider sequences of parentheses which are correctly matched and naturally encoded as bit strings; we call such bit strings balanced.
We define the operations \match{} and \enclose{}.
Finally we define the problem of \emph{range minimum queries} and present its solution for (balanced) bit strings.

Afterwards we leave the realm of bit strings and focus on dictionaries -- data structures for storing sets with a given number of elements.
There are various types of dictionaries which differ by the operations which they support.
We show a simple implementation of a fully indexable dictionary which we use in the introduction of a compressed array.

\section{Operations on Bit Strings}\label{s:op-bs}

We define the following two operations for a general bit string of size $N$ bits.
We use $\ph$ as a placeholder for a value of a bit -- either $0$ or $1$ (and later opening or closing parenthesis).
\begin{description}
	\item[$\rank_{\ph}(S, i) \rightarrow r$]
	Returns the number $r$ of $\ph$ symbols in the bit string $S$ on positions $[0, i]$.
	For convenience we extend the definition of \rank{} as $0$ for $i < 0$ and $\rank_{\ph}(S, N - 1)$ for $i \ge N$.

	\item[$\select_{\ph}(S, r) \rightarrow j$]
	Returns the position $j$ of the $r$-th symbol $\ph$ in the bit string $S$.
	If $r \le 0$, it returns $-1$; if $r > \rank_{\ph}(S, N - 1)$, it returns $N$.
\end{description}
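\begin{example}
	As a small illustration, consider the bit string $S = \mathtt{10110100}$ of size $N = 8$ with positions numbered from $0$; the bit string is chosen here only for demonstration.
	The symbols $1$ are at positions $0, 2, 3, 5$ and the symbols $0$ at positions $1, 4, 6, 7$, hence:
	\begin{align*}
		\rank_1(S, 3) &= 3, & \rank_0(S, 3) &= 1, & \rank_1(S, 7) &= 4, \\
		\select_1(S, 3) &= 3, & \select_0(S, 2) &= 4, & \select_1(S, 5) &= 8 = N.
	\end{align*}
	The last value follows from the convention for $r > \rank_1(S, N - 1)$, because $S$ contains only four symbols $1$.
\end{example}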

Using these two operations we are able to derive more operations related to a given position:
\begin{description}
	\item[$\inspect(S, i) = S[i\char93 \rightarrow \{0, 1\}$]
	Returns the symbol at position $i$ in the bit string $S$.
	If for some reason the bit string $S$ was not accessible, the operation can be implemented using \rank{}s:
	$$ \inspect(S, i) = \rank_1(i) - \rank_1(i - 1). $$

	\item[$\pred_{\ph}(S, i) \rightarrow j$]
	Returns the rightmost position $j \le i$ of the symbol \ph{}:
	$$ \pred_{\ph}(S, i) = \select_{\ph}(S, \rank_{\ph}(S, i)). $$
	
	\item[$\prev_{\ph}(S, i) \rightarrow j$]
	Returns the rightmost position $j < i$ of the symbol \ph{}:
	$$ \prev_{\ph}(S, i) = \select_{\ph}(S, \rank_{\ph}(S, i - 1)). $$

	\item[$\succ_{\ph}(S, i) \rightarrow j$]
	Returns the leftmost position $j \ge i$ of the symbol \ph{}:
	$$ \succ_{\ph}(S, i) = \begin{cases}
		i & \textif S[i] = \ph; \\
		\select_{\ph}(S, \rank_{\ph}(S, i) + 1) & \textotherwise.
	\end{cases} $$

	\item[$\nextt_{\ph}(S, i) \rightarrow j$]
	Returns the leftmost position $j > i$ of the symbol \ph{}:
	$$ \nextt_{\ph}(S, i) = \select_{\ph}(S, \rank_{\ph}(S, i) + 1).$$
\end{description}
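\begin{example}
	The derived operations can be traced on the bit string $S = \mathtt{10110100}$ of size $N = 8$, which is chosen here only for demonstration.
	Its symbols $1$ are at positions $0, 2, 3, 5$, so we obtain:
	\begin{align*}
		\inspect(S, 3) &= \rank_1(S, 3) - \rank_1(S, 2) = 3 - 2 = 1, \\
		\pred_1(S, 4) &= \select_1(S, \rank_1(S, 4)) = \select_1(S, 3) = 3, \\
		\prev_1(S, 3) &= \select_1(S, \rank_1(S, 2)) = \select_1(S, 2) = 2, \\
		\succ_1(S, 4) &= \select_1(S, \rank_1(S, 4) + 1) = \select_1(S, 4) = 5, \\
		\nextt_1(S, 5) &= \select_1(S, \rank_1(S, 5) + 1) = \select_1(S, 5) = 8 = N.
	\end{align*}
	The last result equals $N$ because there is no symbol $1$ to the right of position $5$.
\end{example}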

Unless it is ambiguous, the first argument specifying the bit string on which the operation is performed can be omitted.

\bigbreak

As all other operations can be derived from \rank{} and \select{}, it is sufficient to show succinct indices for these two.
Indices for both operations are well studied.
Many improvements were proposed in order to
\begin{iteminline}
	\item lower the theoretical space complexity,
	\item save memory in practical cases,
	\item speed up the real implementation.
\end{iteminline}
For more details we refer to \cite{gonzalez2005practical, kim2005efficient, makinen2007rank}.

\subsection{\rank}\label{ss:rank}

We start with the description of \rank{}, which is conceptually simpler.
We show the index in the form in which it was originally described by \cite{jacobson1988succinct}.

It is sufficient to support only \rank$_1$ because of the following identity:
$$ i = \rank_0(i) + \rank_1(i) - 1. $$

\bigbreak

We cover the bit string $S$ by \emph{blocks} and \emph{small blocks}.
The $i$-th block of size $B$ starts at position $i B$ and ends at position $(i + 1) B - 1$.
The small blocks are defined the same way, except they use the size $b$.
The sizes will be determined later.

First we design the small blocks so that we can precompute a look-up table \rank{} answering rank queries on them.
The query in the look-up table has one parameter, the small-block-local offset $o = i \% b$, and it returns the number of symbols \ph{} at the offsets $[0, o]$ of the small block.
This is a very similar problem to the one which we showed in the example in section \ref{ex:precomputation}.

The data supplied to the look-up table are the $b$ bits of the small block and the parameter $o$; the answer is in range $[0, b]$.
The algorithm captured by the look-up table simply iterates through the bits of the small block and adds them together.
We set $b = \frac{\log N}{2}$.
The look-up table is a succinct index since $b + \log b + \log (b + 1) < \log N$; the total size is $O(\sqrt{N} \log N \log \log N) = o(N)$ bits.

The blocks represent parts of the bit string of size $B$, which is assumed to be a multiple of $b$.
Each block $k$ has an array $\block_k$ which for each small block $l$ contains the number of symbols \ph{} in the block before the beginning of $l$:
$$ \block_k[l] = \sum_{x = k B}^{k B + l b - 1} [S[x] = \ph]. $$
The sizes of all the block arrays sum up to $\frac{N}{B} \frac{B}{b} \log (B + 1)$.
When $B$ is set to $\log^2 N$, the size is asymptotically $O\left(\frac{N}{\log N} \log \log N\right) = o(N)$.
As the arrays $\block_k$ have the same size (with the exception of the last one), we can concatenate them without need of lemma \ref{l:concat}.

We do the same on the level of the whole bit string by storing an array $\glob$:
$$ \glob[k] = \sum_{x = 0}^{k B - 1} [S[x] = \ph]. $$
The array $\glob$ has $\frac{N}{B}$ elements each of size $\log (N+1)$ bits resulting in size asymptotically $O\left(\frac{N}{\log^2 N} \log{N}\right) = O\left(\frac{N}{\log N}\right) = o(N)$.

\subsubsection{Algorithm}

We find out which block $k$ and which small block $l$ inside the block contains the position $i$, then we simply sum the precomputed values in arrays $\glob$, $\block_k$ and the result of the look-up table for the small block:
\begin{algorithm}
\begin{algorithmic}
\Function{\rank}{$S, i$}
	\State $k \gets \left\lfloor \frac{i}{B} \right\rfloor$%
	\Instr $l \gets \left\lfloor \frac{i \% B}{b} \right\rfloor$%
	\Instr $s \gets b \left\lfloor \frac{i}{b} \right\rfloor$
	\State\Return{$\glob[k] + \block_k[l] + \rank[S[s : s + b], i \% b]$}
\EndFunction
\end{algorithmic}
\end{algorithm}
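\begin{example}
	The following trace is only a sketch with toy sizes $B = 8$ and $b = 4$, which are chosen here for readability and are not the asymptotic settings $b = \frac{\log N}{2}$ and $B = \log^2 N$.
	Let $\ph = 1$ and $S = \mathtt{1011\,0100\,1110\,0011}$, so the small blocks contain $3, 1, 3$ and $2$ symbols $1$, and the arrays are:
	$$ \glob = [0, 4], \qquad \block_0 = [0, 3], \qquad \block_1 = [0, 3]. $$
	For $i = 13$ we get $k = 1$, $l = 1$, $s = 12$ and $i \% b = 1$.
	The small block $S[12 : 16] = \mathtt{0011}$ has no symbol $1$ at the offsets $[0, 1]$, therefore:
	$$ \rank_1(S, 13) = \glob[1] + \block_1[1] + \rank[\mathtt{0011}, 1] = 4 + 3 + 0 = 7, $$
	which agrees with counting the symbols $1$ on positions $[0, 13]$ directly.
\end{example}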

\subsubsection{Implementation Details}

The look-up table can be replaced by the instruction \verb|POPCNT|, which is part of the Advanced Bit Manipulation instruction set and is provided by many CPU architectures (\cite{intelsys}).
Alternatively, an algorithm with a theoretical running time of $O(\log \log N)$ can be used; in practice it causes a negligible slowdown, as the implementation still uses a fixed number of instructions.

\subsection{\select}\label{ss:select}

The operation \select{} is more complicated than \rank{} as it handles parts of the bit string differently depending on the local density of \ph{}.
We show the index as it was proposed by \cite{clark1998compact}.

Two instances of the index are necessary, since there is no identity which would allow a reduction from \select$_1$ to \select$_0$.

\bigbreak

The \select{} operation also covers the bit string with multiple levels of blocks.
The biggest difference from \rank{} is that the sizes of the blocks are not fixed -- they are chosen by the number of symbols \ph{} or their density in the bit string.

We cover the bit string by \emph{blocks} and store offsets of the blocks within the bit string.
If a block is sparse, we store a list of offsets of all symbols \ph{} in it.
Otherwise we process the block with a look-up table.

The bit string is ``covered'' by blocks beginning with \ph{}, each containing $B$ symbols \ph{} with the exception of the last one.
Note that not every symbol of the bit string is covered since there can be sequences of non-\ph{} symbols outside of the blocks; this does not matter as they are not interesting for the queries.
The $\glob$ array contains the offsets of all blocks.

We call a block \emph{sparse} if its size is greater than or equal to $K$ bits, otherwise it is called \emph{dense}.
The value of $K$ will be determined later.
This property can be easily tested from the difference of two consecutive offsets in $\glob$.
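\begin{example}
	As an illustration with toy values $B = 2$ and $K = 6$ (chosen here only for demonstration, not the values derived below), let $\ph = 1$ and $S = \mathtt{10000010110100}$.
	The symbols $1$ are at positions $0, 6, 8, 9, 11$.
	The first block starts at position $0$ and contains the symbols $1$ at positions $0$ and $6$, the second one starts at position $8$ and contains those at positions $8$ and $9$, and the last block starts at position $11$ and contains a single symbol $1$.
	Position $7$ is not covered by any block.
	We get $\glob = [0, 8, 11]$; the differences of consecutive offsets are $8 \ge K$ and $3 < K$, so the first block is sparse and the second one is dense.
\end{example}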

If a block is sparse, we store the block-local offsets of all symbols \ph{} in an array $\blockEnum$.
We will deal with dense blocks later, once we discuss the constraints on $B$ and $K$ and set their values.
For the space density of $\glob$, the following must hold in order to obtain a succinct index:
\begin{gather*}
\frac{\frac{N}{B} \log N}{N} = \frac{\log N}{B} = o(1), \\
B = \omega(\log N).
\end{gather*}

Each offset in the enumeration of a sparse block can be in range $[0, N)$, resulting in up to $\log N$ bits of space.
We have a lower bound on the size of a block ($K$) which allows us to phrase a constraint on the upper bound of the block density.
We need:
\begin{gather*}
\frac{B \log N}{K} = o(1), \\
K = \omega(B \log N).
\end{gather*}

In order to have a strong bound for the dense blocks, we want to set $B$ and $K$ as small as possible.
A sensible choice is to set:
\begin{align*}
	B &= \log N \log \log N, \\
	K &= (\log N \log \log N) \log N \log \log N = B^2.
\end{align*}
There is one more structure which needs to be discussed -- an array $\blocks$ of pointers to the beginnings of representations of blocks.
Nevertheless, its space is the same as that of the array $\glob$.
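\begin{example}
	To get a feeling for these values, assume binary logarithms and $N = 2^{16}$; this size is an arbitrary choice made here to obtain integer values.
	Then $\log N = 16$ and $\log \log N = 4$, so:
	$$ B = 16 \cdot 4 = 64, \qquad K = B^2 = 4096. $$
	A sparse block spans at least $4096$ bits and the enumeration of its $64$ offsets takes at most $64 \cdot 16 = 1024$ bits, which is at most $\frac{1}{4}$ of a bit per bit of the block.
	The array $\glob$ has at most $\frac{N}{B} = 1024$ entries of $16$ bits, which is $16384 = \frac{N}{4}$ bits.
	Both ratios equal $\frac{1}{\log \log N}$, which tends to zero, although slowly.
\end{example}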

\bigbreak

We follow in a similar way with the index for a dense block.
We cover it by \emph{small blocks}, each containing $b$ symbols \ph{}, with the exception of the last one.
Each dense block has an array $\block$ of block-local offsets of beginnings of small blocks.
A small block is called \emph{sparse} if its size is greater than or equal to $k$ bits, and \emph{dense} otherwise.

If a small block is sparse, we store small-block-local offsets of all symbols \ph{} in an array $\smallBlockEnum$.
An array $\smallBlocks$ pointing to the beginnings of the representations of small blocks exists again.
The lower bound on the size of a dense block is $B$ bits.
The constraint on $b$ from the definition of $\block$ is:
\begin{gather*}
\frac{\frac{B}{b} \log K}{B} = \frac{\log K}{b} = o(1), \\
b = \omega(\log K) = \omega(\log(\log N \log \log N)^2) = \omega(\log \log N).
\end{gather*}

Each offset in the enumeration of a sparse small block can be in range $[0, K)$ resulting in up to $\log K$ bits of space.
The lower bound on the size of a block used for calculating the index density is $k$:
\begin{gather*}
\frac{b \log K}{k} = o(1), \\
k = \omega(b\log K).
\end{gather*}

We choose:
\begin{align*}
b &= \log \log N \log \log \log N, \\
k &= (\log \log N \log \log \log N) \log(\log N \log \log N)^2 \log \log \log N = \Theta(b^2).
\end{align*}

It remains to discuss the constraint on the array $\smallBlocks$ which contains $\frac{B}{b}$ pointers of size $l$.
Since the representation of sparse small blocks is succinct, it is smaller than $K$ per block.
The size $l$ of the pointer is $\log K$, and therefore the size of the whole array poses the same constraint as the array $\block$.

\bigbreak

We have been neglecting the case of dense small blocks; now we show how it is handled.
Dense small blocks are limited to size $k = o(\log N)$, which makes it possible to process them using a precomputed look-up table.
Only one look-up table is sufficient because all dense small blocks can be padded from the right to the same size, while retaining the same result.
No additional structure is necessary for dense small blocks; the pointer in the array $\smallBlocks$ is a dummy value.

\subsubsection{Algorithm}

For a given parameter $r$ we find in which block it is contained by division by $B$ and note the remainder as a block-local index.
In the array $\glob$ we find the offset of such block, and access its representation via the pointer in the array $\blocks$.
If the block is sparse, we index the array $\blockEnum$ of all occurrences to obtain the block-local offset; together with the block offset they form the answer.

If the block is dense, we find in which small block the $r$-th symbol \ph{} lies by division by $b$ and note the small-block-local index.
In the array $\block$ we find the block-local offset of the small block.
If the small block is sparse, we index the array $\smallBlockEnum$ to obtain the small-block-local offset; together with the block offset and the small block offset they form the answer.

If the small block is dense, we use the look-up table to find the small-block-local offset.
The answer is then computed in a similar fashion as in the previous case.
We are guaranteed that there exists a constant $c$ such that for all $N$, the following inequality holds: $k < c \cdot \frac{1}{2} \log N$, which means that we are done from the theoretical point of view.

In practice, when we are interested in all values $N$ and not only those big enough for which $c \le 1$, we need to iterate over chunks of $s = \frac{\log N}{2}$ bits of the small block until the desired position is found.
The look-up table is extended to return $-1$ when the desired symbol does not lie in the current chunk.
Note that the loop of the stated algorithm terminates, because we are guaranteed that the small block contains the desired symbol.

\begin{algorithm}
\begin{algorithmic}
\Function{\select}{$S, r$}
	\State $a \gets \left\lfloor \frac{r}{B} \right\rfloor$%
	\Instr $r' \gets r \% B$
	\If{$a\ \textrm{is sparse}$}
		\State \Return{$\glob[a] + \blocks[a].\blockEnum[r']$}
	\Else
		\State $a' \gets \left\lfloor \frac{r'}{b} \right\rfloor$%
		\Instr $r'' \gets r' \% b$
		\State $j \gets \glob[a] + \blocks[a].\block[a']$
		\If{$a'\ \textrm{is sparse}$}
			\State \Return{$j + \blocks[a].\smallBlocks[a'].\smallBlockEnum[r'']$}
		\Else
			\While{$\true$}
				\State $j' \gets \select[S[j : j + s], r'']$
				\If{$j' = -1$}
					\State $r'' \gets r'' - \rank[S[j : j + s], s]$
					\State $j \gets j + s$
				\Else
					\State \Return{$j + j'$}
				\EndIf
			\EndWhile
		\EndIf
	\EndIf
\EndFunction
\end{algorithmic}
\end{algorithm}

\subsubsection{Implementation Details}

We can do the same trick as in \rank{}: we replace the theoretical look-up table with instructions of modern processors, namely \verb|POPCNT| and \verb|PDEP|.
The latter instruction is part of Bit Manipulation Instruction Set 2; details of these instructions are described in the Intel manual \cite{intelsys}.
Using these instructions allows us to process dense small blocks by chunks of $s = 64$ bits.

The while loop can take tens to hundreds of iterations, which is the main reason for the bad performance of this algorithm.
Better results can be achieved by using different indices which, despite not necessarily being succinct, tend to be smaller and faster in real situations.
The details can be found in \cite{kim2005efficient}.

% \todo{bounds}

\section{Balanced Bit Strings}

A \emph{balanced bit string} is a bit string containing \emph{opening} and \emph{closing} parentheses (encoded by $1$ and $0$) such that each parenthesis has its matching one in the bit string.
Since we aim for a systematic succinct data structure, we need to design the storage of the data and the indices separately.

The universe of balanced bit strings with $n$ opening and $n$ closing parentheses has exactly $C_n = \frac{1}{n+1} {2n \choose n}$ elements, where $C_n$ represents the $n$-th \emph{Catalan number}.
The number of bits required for representation of elements in this universe is:
$$ \log C_n \sim \log \frac{4^n}{n \sqrt{\pi n}} = 2n - \log n - \frac{1}{2} \log{\pi n} \sim 2n - O(\log n). $$
Therefore, we can store the balanced bit string using the trivial encoding of parentheses as zeros and ones while retaining the succinctness of the data structure.
We use $N = 2n$ to refer to the size of the bit string.
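\begin{example}
	As a quick sanity check of the count, take $n = 3$ (a value chosen here only for illustration).
	Then $C_3 = \frac{1}{4} {6 \choose 3} = \frac{20}{4} = 5$, and the five balanced bit strings are:
	$$ \mathtt{((()))}, \quad \mathtt{(()())}, \quad \mathtt{(())()}, \quad \mathtt{()(())}, \quad \mathtt{()()()}. $$
	For larger $n$ the gap between $2n$ and $\log C_n$ grows only logarithmically.
	Assuming binary logarithms, the asymptotic formula above gives for $n = 100$ a gap of approximately $\log 100 + \frac{1}{2} \log (100 \pi) \approx 6.64 + 4.15 \approx 10.8$ bits out of $2n = 200$.
\end{example}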

The operations and indices which we have shown for general bit strings stay the same with the minor difference that they are defined for parentheses instead of bits.
Because of the additional structure, which stems from the balanced property, we define more operations on balanced bit strings:
\begin{description}
	\item[$\findClose(i) \rightarrow j$]
	Returns the position $j$ of the closing parenthesis which is paired with the opening one at position $i$; such parenthesis is guaranteed to exist.
	If the parenthesis at the position $i$ is not an opening one, the result is $i$.
	
	\item[$\findOpen(i) \rightarrow j$]
	If $S[i] = \openingParen$, then $i$ is returned; otherwise it returns the position $j$ such that $\findClose(j) = i$.
	
	\item[$\match(i) \rightarrow j$]
	Returns the position $j$ of the parenthesis which is paired with the one on position $i$.
	This operation is just a convenient wrapper around \findOpen{} and \findClose{} depending on the parenthesis at the position $i$.
	It is defined as:
	$$ \match(i) = 
	\begin{cases}
		\findClose(i) & \textif S[i] = \openingParen; \\
		\findOpen(i) & \textotherwise.
	\end{cases}$$
	
	\item[$\excess(i) \rightarrow d$]
	Returns the difference $d$ between the number of opening and closing parentheses until the position $i$:
	$$ \excess(i) = \rank_{\openingParen}(i) - \rank_{\closingParen}(i). $$
	
	\item[$\parenDepth(i) \rightarrow d$]
	The \emph{depth} $d$ of a parentheses pair $(i, \match(i))$ is defined as:
	$$ d = \parenDepth(i) = \excess(\findOpen(i)) - 1 = \excess(i) - [S[i] = \openingParen]. $$
	In order to distinguish the depth of a parentheses pair and the depth of a vertex in a tree (which we will use often later), we call the operation \parenDepth{}.
	
	\item[$\enclose(i) \rightarrow j$]
	Returns the position $j$ of the opening parenthesis which tightly encloses the pair $(i, \match(i))$.
	If the result of \enclose{} is not defined, we set it to $-1$:
	\begin{align*}
		\enclose(i) = -1 \textif \parenDepth(i) = 0.
	\end{align*}
	In all the other cases, the following (in)equalities hold:
	\begin{align*}
		k &= \enclose(i), \\
		\excess(k) &= \excess(\findOpen(i)) - 1, \\
		k &< \findOpen(i), \\
		\match(k) &> \findClose(i).
	\end{align*}
	Sometimes \enclose{} can be generalized to take two parameters:
	
	\item[$\enclose(i_1, i_2) \rightarrow j$]
	Returns the position $j$ of an opening parenthesis which tightly encloses the pairs $(i_1, \match(i_1))$ and $(i_2, \match(i_2))$:
	\begin{align*}
		\excess(j) &< \min(\excess(\findOpen(i_1)), \excess(\findOpen(i_2))), \\
		j &< \min(\findOpen(i_1), \findOpen(i_2)), \\
		\match(j) &> \max(\findClose(i_1), \findClose(i_2)).
	\end{align*}
	This operation is indeed a generalization of the one-parameter \enclose{}:
	$$\enclose(i, i) = \enclose(i).$$
	
	The result does not have to exist in a general balanced bit string, however it will not be an issue in our case.
	
	\item[$\rmqi(i_1, i_2) \rightarrow j$]
	\emph{Range Minimum Query} -- Returns the position $i_1 \le j \le i_2$ such that: 
	$$\excess(\rmqi(i_1, i_2)) \le \excess(k) \ \forall i_1 \le k \le i_2.$$
	If there are multiple positions with the same minimum excess, then the leftmost one is returned.
	
	The value of the minimum is $\rmq(i_1, i_2) = \excess(\rmqi(i_1, i_2))$.
	
	\item[$\RMQi(i_1, i_2) \rightarrow j$]
	\emph{Range Maximum Query} -- Returns the position with maximum excess.
\end{description}
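\begin{example}
	To illustrate the operations, consider the balanced bit string $S = \mathtt{(()(()))}$ of size $N = 8$, chosen here only for demonstration.
	The values of $\excess(i)$ for $i = 0, \dots, 7$ are $1, 2, 1, 2, 3, 2, 1, 0$, and we obtain:
	\begin{align*}
		\findClose(3) &= 6, & \findOpen(6) &= 3, & \match(2) &= 1, \\
		\parenDepth(4) &= 2, & \parenDepth(5) &= 2, & \parenDepth(0) &= 0, \\
		\enclose(4) &= 3, & \enclose(3) &= 0, & \enclose(0) &= -1, \\
		\enclose(1, 4) &= 0, & \rmqi(1, 5) &= 2, & \RMQi(1, 5) &= 4.
	\end{align*}
	The positions $4$ and $5$ form a matching pair, so they have the same depth.
	The range minimum on $[1, 5]$ has the value $\rmq(1, 5) = \excess(2) = 1$.
\end{example}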

We will restrict ourselves to balanced bit strings which contain only one pair of parentheses such that their $\parenDepth$ is equal to zero.
This restriction changes the size of the universe to $C_{n-1}$, which is a negligible difference of $2$ bits.

\bigbreak

It is sufficient to show only the succinct indices for \findClose{}, \enclose{} and \rmqi{} since the others are either similar or derived from them.
The first two are even handled by the same index.
The index for the two-parameter \enclose{} is then implemented using \rmqi.

\subsection{Structure for \match{} and \enclose{}}\label{s:match-enclose}

We describe a succinct index which supports \findClose{}, \findOpen{}, and \enclose{} with one parameter.
We follow the description given by \cite{geary2006simple}; other options are summarized in \cite{arroyuelo2010succinct}.

\bigbreak

First, we cover the balanced bit string by blocks of size $B = \frac{\log N}{2}$.
For each position $i$ we denote by $B(i)$ the block to which it belongs.

\subsubsection{Pioneers}

We provide definitions of special parentheses:
\begin{description}
	\item[far, near] 
	A parenthesis $i$ is \emph{far} if $B(i) \ne B(\match(i))$; otherwise we call it \emph{near}.
	Note that the matching parenthesis of a far parenthesis is also a far parenthesis.
	
	We observe that each block contains first all closing far parentheses and then all opening far parentheses.
	
	\item[opening (closing) pioneer]
	An opening (closing) far parenthesis $i$ is an \emph{opening (closing) pioneer} if the matches of $i$ and of a preceding opening (following closing) far parenthesis $j$ are located in different blocks:
	$$ j < i \booland B(\match(j)) \ne B(\match(i)). $$
	
	Note that a matching parenthesis of an opening (closing) pioneer does not have to be a closing (opening) pioneer.
	
	\item[pioneer]
	A \emph{pioneer} is either an opening pioneer, a closing pioneer, or the matching parenthesis of one of them.
	A \emph{pioneer pair} is a pair of matching parentheses which are pioneers.
	
	Note that pioneers form a subsequence of parentheses which are correctly matched.
	Also note that the first opening far parenthesis and the last closing far parenthesis in a block are pioneers.
\end{description}

\begin{example}
	In the following bit string $S$ we mark the positions of the far parentheses (\texttt{focp}), opening pioneers (\texttt{o}), closing pioneers (\texttt{c}), and pioneers (\texttt{ocp}):
	$$ S = \mathtt{\x(o \x(f ( ( ) ) \mid \x(p ( ) ( ) \x(f \mid \x)f ( ) \x)c \x)f \x)c} $$
\end{example}

\begin{lemma}
	There are $O\left(\frac{N}{B}\right)$ pioneers.
\end{lemma}
\begin{proof}
	For every pair of blocks there exists at most one pioneer pair.
	This is certainly true if the opening and closing pioneers are considered separately.
	Let's assume there are two pioneer pairs between this pair of blocks: one with an opening pioneer, the other one with a closing pioneer.
	Because the pioneer parentheses are correctly matched, one pair is enclosed by the other one, and therefore neither the opening nor the closing parenthesis of the enclosed pair can be a pioneer by definition.
	
	Let's consider a graph whose vertices are the blocks of the bit string and whose edges connect the blocks which are joined by a pioneer pair.
	Such a graph is outerplanar, which bounds the number of its edges: $|E| \le 2 |V| - 3$, while $|V| = O\left(\frac{N}{B}\right)$.
	There are at most $|E|$ pioneer pairs, and therefore at most $2 |E|$ pioneers.
\end{proof}

\subsubsection{Block Queries}

We will use two similar look-up tables to answer all queries within a block.
\begin{itemize}
	\item $\fwdSearch[S, i, d, \paren, \far]$ returns the first position $j \ge i$ for which $\excess(S, j) = \excess(S, i) + d$ and $S[j] = \paren$ hold.
	If $\far = \true$, then $j$ must be a far parenthesis.
	\item $\bwdSearch[S, i, d, \paren, \far]$ is defined in the same way, except that it returns the last position $j \le i$ satisfying these conditions.
\end{itemize}
The look-up tables return $-1$ if such position does not exist in the queried block.

There are two special cases which we address separately:
\begin{description}
	\item[A block query for a matching parenthesis]
	We distinguish two cases depending on the parenthesis $S[i]$:
	$$\match(S, i) = \begin{cases}
		\fwdSearch[S, i, -1, \closingParen, \false] & \textif S[i] = \openingParen; \\
		\bwdSearch[S, i, 1, \openingParen, \false] & \textotherwise.
	\end{cases}$$
	
	\item[A block query for an enclosing parenthesis]
	We assume that $S[i]$ is an opening parenthesis since the other case will never occur.
	We run two queries for which we return the first non-negative result:
	\begin{enumerate}
		\item $\bwdSearch[S, i, -1, \openingParen, \false]$ which returns $\enclose(S, i),$
		\item and $\fwdSearch[S, i, -2, \closingParen, \false]$ returning $\match(\enclose(S, i))$.
	\end{enumerate}
\end{description}
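\begin{example}
	The block queries can be traced on a single toy block $S = \mathtt{(()(}$ of size $4$, which is chosen here only for demonstration and is much shorter than a real block.
	The excesses within the block are $1, 2, 1, 2$.
	\begin{itemize}
		\item $\fwdSearch[S, 1, -1, \closingParen, \false] = 2$, because position $2$ is the first one to the right of position $1$ which holds a closing parenthesis with the excess $2 - 1 = 1$; hence $\match(S, 1) = 2$.
		\item $\fwdSearch[S, 0, -1, \closingParen, \false] = -1$, because no position in the block has the excess $0$; the parenthesis at position $0$ is therefore far.
		\item $\bwdSearch[S, 3, -1, \openingParen, \false] = 0$, because position $0$ is the nearest one to the left of position $3$ which holds an opening parenthesis with the excess $2 - 1 = 1$; hence $\enclose(S, 3) = 0$.
	\end{itemize}
\end{example}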

\bigbreak

We aim to reduce the queries from the original bit string $S$ to queries on a bit string $P$ consisting of pioneers.
We are allowed to store the array $P$ since its size is $O\left(\frac{N}{\log N}\right) = o(N)$.

We will also use a structure which tells us the positions of pioneers in the bit string $S$.
A naïve approach would be to use a bit string $P'$ marking the positions of pioneers: $P'[i] = 1 \iff i\ \textrm{is a pioneer}$.
We equip $P'$ with indices for $\rank_1$ and $\select_1$.
The problem is that the size of the bit string $P'$ is $N$ instead of $o(N)$.
To address this problem, we use a \emph{fully indexable dictionary} which we introduce in the next section.

\subsubsection{Reduction of \findClose}

The operation $\findClose(S, i)$ is performed using $\findClose(P, i')$ as follows.
If $S[i]$ is a closing parenthesis, we return $i$.
We then use the look-up table to find out whether the answer lies in the block $B(i)$, and if so, we return it.

Otherwise, $i$ is an opening far parenthesis.
Either $i$ is a pioneer or we find the preceding pioneer; we denote it $i'$ in both cases.
It must be an opening parenthesis because the first opening far parenthesis in a block is such and there cannot be another pioneer pair between $i'$ and $i$.
We find the match of $i'$ as:
$$ k = \findClose(S, i') = \select_1(P', \findClose(P, \rank_1(P', i'))). $$
If $i = i'$, then $k$ is the answer which we return.

Otherwise, we know that $B(k) = B(\findClose(S, i')) = B(\findClose(S, i))$; else $i$ would have been a pioneer.
To find the answer within the block $B(k)$, we use a look-up table \bwdSearch{} to find the far parenthesis $j$ with the right excess difference $d = \excess(i) - \excess(i')$.
The parenthesis $j$ is guaranteed to exist.

\begin{algorithm}
\begin{algorithmic}
\Function{\findClose}{$S, i$}
	\If{$S[i] = \closingParen$}
		\State \Return{$i$}
	\Else
		\State $j' \gets \fwdSearch[S[B(i)B: (B(i) + 1)B], i \% B, -1, \closingParen, \false]$
		\If{$j' \ne -1$}
			\State \Return{$B(i) B + j'$} \Comment{Same block}
		\Else
			\State $i' \gets \pred_1(P', i)$
			\State $k \gets \select_1(P', \findClose(P, \rank_1(P', i')))$ \Comment{Recursion}

			\State $d \gets \excess(i) - \excess(i')$
			\State $j' \gets \bwdSearch[S[B(k)B: (B(k) + 1)B], k \% B, d, \closingParen, \true]$
			\State \Return{$B(k) B + j'$}
		\EndIf
	\EndIf
\EndFunction
\end{algorithmic}
\end{algorithm}

The operation \findOpen{} is reduced similarly; together they provide the operation \match{}.

\subsubsection{Operation \enclose}

We show how to perform the operation $\enclose(S, i)$.
We assume without loss of generality that $i$ is an opening parenthesis.

If the answer to $\enclose(S, i)$ is within the block $B(i)$, we use a look-up table to report it, and stop.
Otherwise, if the closing parenthesis of the enclosing pair lies within the block $B(\findClose(i))$, we find it using a look-up table, report its matching opening parenthesis, and stop.

Otherwise we aim for recursion.
The parenthesis pair $(j, \match(j))$ tightly enclosing $i$ is not contained in the block $B(i)$, therefore both parentheses must be far.
Since there exists an edge between blocks $B(j)$ and $B(\match(j))$, there must exist exactly one pioneer pair $(f, \match(f))$ connecting these blocks.
We find the parenthesis $f$ depending on the nearest pioneer $i' = \succ_1(P', i)$, which is enclosed by $i$ and therefore also by $j$:
\begin{enumerate}
	\item $S[i']$ is a closing parenthesis.
	Then its matching opening parenthesis is at position $f = \findOpen(i') < i$ and $f$ is the opening parenthesis of the pioneer pair for which we were looking.
	It cannot happen that $f \ge i$ because $f$ is a pioneer and it would have otherwise been found instead of $i'$ by the successor query.
	\item $S[i']$ is an opening parenthesis.
	Then $f < i \le i' < \findClose(i) < \match(f)$.
	At the same time $f = \enclose(i')$, which we solve by recursion.
\end{enumerate}
Once we have $f$, we continue in a similar way as in the case of $\findClose$ -- we use a look-up table to find the parenthesis with the right excess in the block $B(f)$.

\begin{algorithm}
\begin{algorithmic}
\Function{\enclose}{$S, i$}
	\If{$S[i] = \closingParen$}
		\State $i \gets \findOpen(i)$
	\EndIf

	\State $j' \gets \bwdSearch[S[B(i)B : (B(i)+1) B], i \% B, -1, \openingParen, \false]$
	\If{$j' \ne -1$}
		\State \Return{$B(i) B + j'$} \Comment{Opening in the same block}
	\EndIf
	
	\State $i'' \gets \findClose(i)$
	\State $j' \gets \fwdSearch[S[B(i'')B : (B(i'')+1) B], i'' \% B, -2, \closingParen, \false]$
	\If{$j' \ne -1$}
		\State \Return{$\findOpen(B(i'') B + j')$} \Comment{Closing in the same block}
	\EndIf
	
	\State

	\State $i' \gets \succ_1(P', i)$
	\If{$S[i'] = \closingParen$}
		\State $f \gets \findOpen(i')$
	\Else
		\State $f \gets \select_1(P', \enclose(P, \rank_1(P', i')))$ \Comment{Recursion}
	\EndIf
	
	\State $d \gets \excess(i) - \excess(f) - 1$
	\State $j' \gets \fwdSearch[S[B(f)B: (B(f) + 1)B], f \% B, d, \openingParen, \true]$
	\State \Return{$B(f) B + j'$}
\EndFunction
\end{algorithmic}
\end{algorithm}

\subsubsection{Recursion}

The recursion as it was defined reduces the query from a bit string of size $N$ to one of size $O(\frac{N}{\log N})$.
After $t$ levels the bit string has size $O\left(\frac{N}{\log^t N}\right)$; we could use $t = O\left(\frac{\log N}{\log\log N}\right)$ to reduce the size to $O(1)$ which would guarantee that the query fits in a single block.
However that would result in a superconstant time complexity of the operation.
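\begin{example}
	The bound on $t$ can be verified by a short calculation, in which we ignore the constants hidden in the $O$ notation and assume binary logarithms:
	$$ \frac{N}{\log^t N} = 1 \iff \log N = t \log \log N \iff t = \frac{\log N}{\log \log N}. $$
	For instance, for $N = 2^{16}$ (a size chosen here only to obtain integer values) we have $\log N = 16$ and $\log \log N = 4$, hence $t = 4$, and indeed $\frac{2^{16}}{16^4} = 1$.
\end{example}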

We instead require only a constant number of levels of the recursion; $t = 2$ is sufficient.
For every position in a bit string of size $O\left(\frac{N}{\log^2 N}\right)$ we can precompute the answers to both operations \match{} and \enclose{}; such table $T$ has size $O\left(\frac{N}{\log^2 N} \log \frac{N}{\log^2 N}\right) = O\left(\frac{N}{\log N}\right) = o(N)$.
Note that this table is not a universal look-up table which could be shared among multiple instances of the data structure.

We do not even need to represent the bit string $P$ on the second level since we only need the index to the table $T$ provided by the indexable dictionary $P'_2$.

\bigskip

All space complexities so far are:
\begin{itemize}
	\item $O\left(\frac{N}{\log N}\right)$ -- the bit string $P$ on the first level;
	\item $O\left(\frac{N}{\log N}\log \log N\right)$ -- the fully indexable dictionary $P'$ on the first level;
	\item $O\left(\frac{N}{\log^2 N}\log \log N\right)$ -- the fully indexable dictionary $P'_2$ on the second level;
	\item $O\left(\frac{N}{\log N}\right)$ -- the precomputed table $T$ on the second level;
	\item $O\left(\sqrt{N}\log^2 N \log\log N\right)$ -- the universal look-up tables \fwdSearch{} and \bwdSearch{}.
\end{itemize}
Note that we have no special requirements on the operations supported by $S$ and $P$; we only need to access up to $B$ consecutive bits, which bit strings allow.

\subsection{Index for Range Minimum Query}

The problem of \emph{range minimum query} has applications in many areas and therefore is well studied: \cite{bender2000lca, fischer2010optimal, durocher2013simple}.
In this section we first define a more general problem (as did \cite{sadakane2010fully}) which we fully solve in a later chapter.
We then show a simple succinct index which will still be sufficient in most cases and will be needed for the general solution.

Although range minimum query can be defined for an arbitrary bit string (or even an array of numbers), it will be useful only for balanced bit strings.
That is the reason why we present it here.

\subsubsection{$\pm 1$ Functions}\label{ss:rmq-def}

We first describe all possible functions $g$ which map the values of bits $\{0, 1\}$ into the set $\{-1, 0, 1\}$.
We call these functions $\pm 1$ functions.
They are:
\begin{itemize}
	\item $\phi(b) = b$,
	\item $\psi(b) = 1 - b$,
	\item $\pi(b) = 2 b - 1$,
	\item inverses ($-\phi(b), -\psi(b), -\pi(b)$) and constant functions mapping each value of $b$ to the same fixed value.
	They are mentioned only to clarify that there are $3^2$ functions in total.
	These functions will not be useful for us.
\end{itemize}

We define an array $G$ for a function $g$ as:
$$ G[i] = \begin{cases}
	\sum_{k = 0}^i g(S[k]) & \textif i \ge 0; \\
	0 & \textotherwise.
\end{cases} $$

\bigbreak

We redefine the operations \rmqi{} and \RMQi{} to use the array $G$ instead of the excesses of $S$.
Range Minimum Query $\rmqi(G, i_1, i_2)$ returns the smallest position $j \in [i_1, i_2]$ such that $G[j] \le G[k]$ for all $i_1 \le k \le i_2$.
Similarly we define Range Maximum Query $\RMQi(G, i_1, i_2)$ returning the leftmost position of the maximum value of $G$ in the range $[i_1, i_2]$.
We also define the functions $\rmq(G, i_1, i_2)$ and $\RMQ(G, i_1, i_2)$ returning the value rather than the position.

We can see that only the function $\pi$ is useful, as the others yield a monotonic array $G$, for which the range minimum and maximum are always attained at the endpoints $i_1$ and $i_2$.
An alternative definition of the array $G$ applied to $\pi$ (which we call $E$) uses the excess function:
$$ E[i] = \sum_{k=0}^i \pi(S[k]) = \rank_1(i) - \rank_0(i) = \excess(i). $$
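As a small illustration (the bit string below is our own example), consider $S = 110100$, which encodes the balanced sequence \texttt{(()())} assuming that $1$ stands for an opening parenthesis:
\begin{center}
\begin{tabular}{l|cccccc}
	$i$ & 0 & 1 & 2 & 3 & 4 & 5 \\
	\hline
	$S[i]$ & $1$ & $1$ & $0$ & $1$ & $0$ & $0$ \\
	$\pi(S[i])$ & $1$ & $1$ & $-1$ & $1$ & $-1$ & $-1$ \\
	$E[i]$ & $1$ & $2$ & $1$ & $2$ & $1$ & $0$ \\
\end{tabular}
\end{center}
On the range $[1, 4]$ the values of $E$ are $2, 1, 2, 1$, so $\rmqi(E, 1, 4) = 2$ (the leftmost of the two positions with the value $1$) and $\rmq(E, 1, 4) = 1$, while $\RMQi(E, 1, 4) = 1$ and $\RMQ(E, 1, 4) = 2$.
For comparison, the array built from $\phi$ is $1, 2, 2, 3, 3, 3$, which is non-decreasing, so its minimum on any range is attained at the left end.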

\bigbreak

Although we defined the array $G$ to be derived by an application of a function on individual bits of $S$, we only need that $O(\log N)$ consecutive elements of the array $G$ can be derived from $O(\log N)$ consecutive bits of $S$.
A function which we could use instead of \excess{} is for example \parenDepth{}.
It differs from \excess{} by one at positions of opening parentheses and it can be expressed as a function depending on two consecutive bits:
$$\delta(a, b) = 
\begin{cases}
	-1 &\textif a = 0 \booland b = 0; \\
	0 &\textif a = 0 \booland b = 1; \\
	0 &\textif a = 1 \booland b = 0; \\
	1 &\textif a = 1 \booland b = 1.
\end{cases} $$
An array $D$ of depths of parentheses is then defined as:
$$ D[i] = \sum_{k=0}^i \delta(S[k - 1], S[k]) = \parenDepth(i), $$
where we set $S[-1] = 0$ so that the first term of the sum is defined.
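A quick consistency check, under the assumption that the undefined bit before the string is taken as $S[-1] = 0$: the four cases of $\delta$ coincide with $\delta(a, b) = a + b - 1 = \pi(b) - (b - a)$, so the sum telescopes:
$$ D[i] = \sum_{k=0}^i \pi(S[k]) - \sum_{k=0}^i \big(S[k] - S[k-1]\big) = E[i] - S[i] + S[-1] = E[i] - S[i]. $$
For the illustrative string $S = 110100$, that is \texttt{(()())}, we get $E = 1, 2, 1, 2, 1, 0$ and $D = 0, 1, 1, 1, 1, 0$; the two arrays differ by one exactly at the opening parentheses on positions $0$, $1$ and $3$.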

The property that $| G[i] - G[i-1] | \le 1$ will be useful much later when we introduce a more complicated and also more versatile structure.

\subsubsection{A Simple Index}\label{sss:rmq-index}

We build a succinct index supporting range minimum/maximum queries for the array of excesses.
We only show the case of \rmqi{} as \RMQi{} is realized the same way; \rmq{} and \RMQ{} can then be implemented straightforwardly using \rmqi{}, \RMQi{} and \excess{}.
The array $E$ is never stored explicitly.

When we solve queries on range $[i, j]$ on top of an array which is decomposed into blocks, we split it into three parts based on which blocks are fully contained in the range.
Since the functions $\min$ and $\max$ are distributive over a concatenation of ranges, each part of the original range can be processed independently.
We call the three parts, which together form the original range:
\begin{description}
	\item[prefix] the non-full block containing $i$;
	\item[suffix] the non-full block containing $j$;
	\item[span] the interval of full blocks between $i$ and $j$.
\end{description}
Any of prefix, suffix or span (or more of them) can be empty in a query.

There is a special case of a range which is fully contained within a single block; we solve it separately using the following lemma.
\begin{lemma}\label{lemma:rmq1}
	We can solve the operation $\rmqi(E, i_1, i_2)$ in constant time whenever $i_2 - i_1 + 1 \le \frac{\log N}{2}$.
\end{lemma}
\begin{proof}
	We can express every element $E[i]$ for $i \in [i_1, i_2]$ as:
	$$ E[i] = E[i_1 - 1] + \sum_{k = i_1}^{i} \pi(S[k]). $$
	All the values depend only on the block $S[i_1 : i_2 + 1]$ and the value of $E[i_1 - 1]$.
	Because the position of the minimum is independent of the absolute value, the dependence on $E[i_1 - 1]$ is irrelevant.
	
	By the precondition of the lemma, the block has a size of at most $\frac{\log N}{2}$, which makes it possible to use it as an index to a look-up table \rmqi.
	The look-up table is parametrized by the length of the block, which is the value of $i_2 - i_1 + 1$, which is encoded in $O(\log \log N)$ bits.
	The parameter is necessary because of the zero-padding of words in our definition of RAM.
	The table simply returns the position of the minimum.
\end{proof}
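The size of this look-up table can be estimated as follows (a rough count, assuming one entry per pair of block content and block length).
There are at most $2^{\frac{\log N}{2}} = \sqrt{N}$ block contents and at most $\frac{\log N}{2}$ lengths, and each entry is a position inside the block stored in $O(\log \log N)$ bits:
$$ \sqrt{N} \cdot \frac{\log N}{2} \cdot O(\log \log N) = O\left(\sqrt{N} \log N \log \log N\right) = o(N). $$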

\bigbreak

We follow with a lemma which will solve the \rmqi{} for spans.
Although it has a large space complexity depending on the number of blocks, eventually, we will be able to lower it to $o(N)$ by the right choice of block sizes.

\begin{lemma}\label{lemma:rmq2}
	Given an array $P$ of $p$ positions of minima in blocks on a lower level, we can solve the \rmqi{} for span queries in constant time using $O(p \log^2 p)$ bits of memory.
\end{lemma}
\begin{proof}
	We call $l = i_2 - i_1 + 1$ the number of blocks over which the query spans.
	We distinguish two cases:
	\begin{enumerate}
		\item $l = 2^k$ is a power of two.
		We simply use a value from a precomputed table $\Tm$, where $\Tm[i, k]$ records the index of the block containing the minimum among the $2^k$ blocks starting at block $i$.
		This table of $O(p \log p)$ elements requires $O(p \log^2 p)$ bits in total.
		
		\item $l$ is not a power of two.
		We substitute the query on $l$ blocks by two queries each spanning $2^k$ blocks where $k = \lfloor \log l \rfloor$.
		The two queries overlap, however it does not cause any issue since we are only looking for a minimum of their answers.
		We gather both candidates and return the one which has the smaller value, while preferring the left one.
		
		\begin{algorithm}
		\begin{algorithmic}
		\Function{\rmqiSpan}{$E, P, i_1, i_2$} \Comment{$i_1, i_2$ are block numbers.}
			\State $k \gets \lfloor \log (i_2 - i_1 + 1) \rfloor$
			\State $p_1 \gets \Tm[i_1, k]$
			\State $p_2 \gets \Tm[i_2 - 2^k + 1, k]$ \Comment{The last $2^k$ blocks of the span.}
			\If{$E[P[p_1]] \le E[P[p_2]]$}
				\State \Return{$p_1$}
			\Else
				\State \Return{$p_2$}
			\EndIf
		\EndFunction
		\end{algorithmic}
		\end{algorithm}
	\end{enumerate}
\end{proof}
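The table $\Tm$ can be filled by doubling. The following recurrence is the standard sparse-table construction, stated here as a sketch under the assumption that $\Tm[i, k]$ covers the blocks $[i, i + 2^k - 1]$ and that ties are resolved to the left:
$$ \Tm[i, 0] = i, \qquad
\Tm[i, k] = \begin{cases}
	\Tm[i, k - 1] & \textif E[P[\Tm[i, k - 1]]] \le E[P[\Tm[i + 2^{k-1}, k - 1]]]; \\
	\Tm[i + 2^{k-1}, k - 1] & \textotherwise.
\end{cases} $$
As an illustration with made-up values, let $p = 5$ and let the block minima be $3, 1, 2, 1, 4$ for the blocks $0, \ldots, 4$.
For the span $i_1 = 1$, $i_2 = 3$ we have $l = 3$ and $k = 1$; the left query covers the blocks $[1, 2]$ and gives $p_1 = \Tm[1, 1] = 1$, the right query starts at block $i_2 - 2^k + 1 = 2$, covers the blocks $[2, 3]$ and gives $p_2 = \Tm[2, 1] = 3$.
Both candidates have the value $1$, so the left one, $p_1 = 1$, is returned.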

The bit string $S$ is covered by blocks of size $B$ and small blocks of size $b = \frac{\log N}{2}$.
For each block we store the position of its minimum in an array $P_1$, which requires $O\left(\frac{N}{B} \log B\right)$ bits of memory.
We build the \rmqi{} structure by lemma \ref{lemma:rmq2} on top of this array, which adds another $O\left(\frac{N}{B} \log^2 \frac{N}{B}\right)$ bits of memory.

For all small blocks in a block $i$ we precompute an array $P_2[i]$ of positions of their minima; this array uses $O\left(\frac{B}{b} \log \log N\right)$ bits per block.
On top of this array we build again the \rmqi{} structure, which uses $O\left(\frac{B}{b} \log^2 \frac{B}{b}\right)$ bits per block.

In order to keep all densities $o(1)$, we set $B = \log^3 N$.
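To see that this choice works, we can substitute $B = \log^3 N$ and $b = \frac{\log N}{2}$ into the bounds above and sum the per-block costs over all $\frac{N}{B}$ blocks (a back-of-the-envelope check which drops constant factors):
\begin{align*}
	\frac{N}{B} \log B &= O\left(\frac{N \log \log N}{\log^3 N}\right), &
	\frac{N}{B} \log^2 \frac{N}{B} &= O\left(\frac{N}{\log N}\right), \\
	\frac{N}{B} \cdot \frac{B}{b} \log \log N &= O\left(\frac{N \log \log N}{\log N}\right), &
	\frac{N}{B} \cdot \frac{B}{b} \log^2 \frac{B}{b} &= O\left(\frac{N (\log \log N)^2}{\log N}\right).
\end{align*}
All four terms are $o(N)$; the last one dominates.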

\subsubsection{Algorithm}

When we process the query, we first split the interval $[i_1, i_2]$ into:
\begin{enumerate}
	\item up to one top-level span of blocks; the prefix and suffix are passed to the lower level;
	\item up to two spans of small blocks, which form parts of the prefix and suffix from the top level;
	\item up to two small blocks, which are not fully covered by the interval.
\end{enumerate}
In the special case when the range is contained within a single block or a single small block, the range is passed to the lower level.

We gather the candidates for minimum in (1) and (2) by querying the \rmqi{} structures and in (3) by using a look-up table as described by lemma \ref{lemma:rmq1}.
Once we have all candidates $C$, we return the one with the lowest value:
$$ j = \argmin_{c \in C} E[c]. $$

We omit the pseudo-code of the algorithm as it deals with many cases which are essentially the same.
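The split on a single level can nevertheless be stated explicitly. The following sketch assumes that block $j$ covers the positions $[jB, (j + 1)B - 1]$; the names $j_1$ and $j_2$ are introduced only for this illustration.
The full blocks inside $[i_1, i_2]$ are $j_1, \ldots, j_2$ where
$$ j_1 = \left\lceil \frac{i_1}{B} \right\rceil, \qquad j_2 = \left\lfloor \frac{i_2 + 1}{B} \right\rfloor - 1, $$
the prefix is $[i_1, j_1 B - 1]$ and the suffix is $[(j_2 + 1) B, i_2]$.
The span is empty when $j_1 > j_2$; the case $j_1 = j_2 + 2$ arises when the range lies strictly inside one block, which is the special case mentioned above.
With the toy value $B = 8$ and the range $[5, 29]$ we get $j_1 = 1$ and $j_2 = 2$, so the prefix is $[5, 7]$, the span consists of the blocks $1$ and $2$ (positions $[8, 23]$), and the suffix is $[24, 29]$.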

\subsubsection{\label{ss:enclose2}Two-parameter \enclose}

We use the operation \rmqi{} to support the two-parameter \enclose{}.

Without loss of generality, we assume that both $i_1$ and $i_2$ are opening parentheses and that $i_1 \le i_2$.
We first check if $i_2$ is enclosed by $i_1$; in such case we simply reduce the operation to a one-parameter \enclose{}.

Otherwise the parentheses pair of $i_1$ does not contain $i_2$, and vice versa.
By definition, we are looking for a parentheses pair $(j, \match(j))$ which spans over the interval from $\min(i_1, i_2)$ to $\match(\max(i_1, i_2))$.
The parentheses pair of $j$ contains two parentheses pairs $p_1 < p_2$ such that each of them contains one of $i_1$, $i_2$ and $\excess(p_1) = \excess(p_2) = \excess(j) + 1$.

We observe that the following properties hold for $\findClose(p_1)$:
\begin{itemize}
	\item it is contained in the interval $[i_1, i_2]$;
	\item it has the minimum \excess{} on such interval;
	\item it is the leftmost parenthesis with such \excess{}.
\end{itemize}
The second property follows from the fact that it is a closing parenthesis and that:
$$ \excess(k) \ge \excess(j) \ \forall j \le k < \match(j). $$

We also observe that $\findClose(p_1) + 1$ is an opening parenthesis of the next parentheses pair following $p_1$.
These properties allow us to find a parenthesis which is tightly enclosed by $j$ and reduce the query to a one-parameter enclose:
$$ j = \enclose(i_1, i_2) = \enclose(\rmqi(E, i_1, i_2) + 1). $$
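A worked example on a sequence of our own choosing: let the sequence be \texttt{(()(()))} with positions $0, \ldots, 7$, and let $i_1 = 1$ and $i_2 = 4$.
\begin{center}
\begin{tabular}{l|cccccccc}
	position & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
	\hline
	parenthesis & \texttt{(} & \texttt{(} & \texttt{)} & \texttt{(} & \texttt{(} & \texttt{)} & \texttt{)} & \texttt{)} \\
	\excess{} & 1 & 2 & 1 & 2 & 3 & 2 & 1 & 0 \\
\end{tabular}
\end{center}
The pair of $i_1$ is $(1, 2)$ and does not contain $i_2$.
On the interval $[1, 4]$ the excesses are $2, 1, 2, 3$, so the leftmost minimum is at position $2$, which is the closing parenthesis of $p_1 = 1$.
Position $3$ is the opening parenthesis of $p_2 = 3$, whose pair $(3, 6)$ contains $i_2$.
A one-parameter \enclose{} of position $3$ yields $j = 0$, and the pair $(0, 7)$ indeed encloses both $i_1$ and $i_2$.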

\section{Dictionaries}

The bit strings which we have been discussing so far can be seen from a different point of view:
A bit string $S$ of size $N$ is a representation of a set $A$, which is a subset of $[0, N)$ such that a number $x \in A \iff S[x] = 1$.

The natural encoding is the one which we have used -- representing the membership of each number in the universe $[0, N)$ by a bit in a characteristic vector.
It is still true that such an encoding is succinct as there are $| \mathcal{A}(N) | = 2^{|[0, N)|} = 2^N$ possible sets in the universe, and therefore $N$ bits are required.

However, we can restrict the subsets $A$ of $[0, N)$ by the number of elements $K$.
There are ${N \choose K}$ sets in the parametrized universe requiring $\log {N \choose K}$ bits of memory.
From this point of view of sets parametrized by the number of elements, the encoding by the membership of each element is not succinct.
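For instance (the numbers are chosen only for illustration, with logarithms taken to base $2$), for $N = 8$ and $K = 2$ there are ${8 \choose 2} = 28$ sets, so $\lceil \log 28 \rceil = 5$ bits suffice, whereas the characteristic vector takes $8$ bits.
The gap grows quickly for sparse sets: for $K = 1$ a single position of $\lceil \log N \rceil$ bits is enough, compared to $N$ bits of the characteristic vector.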

\bigbreak

We define structures supporting various operations, which we already know from the general bit strings.
\begin{description}
	\item[dictionary]
	A \emph{dictionary} is a data structure which stores $A$ and can answer membership queries $x \in A$.
	
	\item[indexable dictionary]
	An \emph{indexable dictionary} (ID) is a dictionary which can answer \rank{} for elements of the set $A$ and \select{} queries.
	\begin{align*}
		\rank(A, i) &= |\{ a \in A : a \le i \}| \\
		\select(A, j) &= i \text{ such that } i \in A \booland \rank(A, i) = j
	\end{align*}
	
	\item[fully indexable dictionary]
	A \emph{fully indexable dictionary} (FID) is an indexable dictionary for both sets $A$ and its complement $\overline{A}$.
	
	We can extend the $\rank(A, i)$ for all $i \in [0, N)$:
	$$ \rank(A, i) = \begin{cases}
		\rank(A, i) & \textif i \in A; \\
		i - \rank(\overline{A}, i) + 1 & \textotherwise.
	\end{cases}$$
	
	Moreover, all operations \pred{}, \succ{}, \prev{}, \nextt{} are well defined for FID.
	FID is equivalent to a general bit string.
\end{description}
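To illustrate the extension of \rank{} in a fully indexable dictionary with a set of our own choosing, take $N = 6$ and $A = \{1, 4, 5\}$, so $\overline{A} = \{0, 2, 3\}$.
For $i = 3 \notin A$ we have $\rank(\overline{A}, 3) = 3$ and therefore
$$ \rank(A, 3) = 3 - \rank(\overline{A}, 3) + 1 = 3 - 3 + 1 = 1, $$
which agrees with the fact that $1$ is the only element of $A$ not exceeding $3$.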

We will use the notation from general bit strings:
\begin{align*}
	\rank_1(A, i) &= \rank(A, i), \\
	\rank_0(A, i) &= \rank(\overline{A}, i), \\
	\select_1(A, j) &= \select(A, j), \\
	\select_0(A, j) &= \select(\overline{A}, j).
\end{align*}

\bigbreak

The first and only succinct indexable dictionary was developed by \cite{raman2007succinct} in Theorem~4.1.
We state their result as a lemma without proof.

\begin{lemma}\label{l:succint-indexable-dictionary}
	There is a succinct indexable dictionary which uses $\log {N \choose K} + o(K) + O(\log \log N)$ bits.
\end{lemma}

\subsection{Sublinear Fully Indexable Dictionary}\label{ss:sublinear-fid}

We show a simple fully indexable dictionary of size $\log {N \choose K} + o(N)$.
In all cases when we use this structure, the space complexity will be $o(N)$, which is the reason for its name.
Note that if $N = 2K$, then $\log {N \choose K} = N - O(\log N)$; balanced bit strings are a special case of such a setup.

This structure is based on Lemma~4.1 in \cite{raman2007succinct}.
The difference is that we reuse the succinct indices for \rank{} and \select{} which we showed in section \ref{s:op-bs} instead of implementing them again.

This FID is also the missing piece in the structure for \match{} and \enclose{} in section \ref{s:match-enclose}.

\bigbreak

The main idea of the data structure is to provide access to $B = \frac{\log N}{2}$ consecutive bits of the characteristic vector at a time.
Since any algorithm can access at most $w = O(\log N) = O(B)$ bits in a single step, this slows it down only by a constant factor.

We split the characteristic vector $S$ into blocks $S_1, S_2, \ldots$ of size $B$ with each of them having $K_1, K_2, \ldots$ ones.
We extend $N$ so that the last block is full; this difference is negligible.

We represent each block $i$ implicitly by a number from the range $\big[0, {B \choose K_i}\big)$ encoded in $b_i \le B$ bits.
The space bound for storing all blocks consecutively as a bit string $S'$ follows from the generalized Chu-Vandermonde's identity (\cite{belbachircombinatorial}), since a single product on its left-hand side is at most the whole sum.
\begin{align*}
	\sum_{K_1 + K_2 + \dots = K} \prod_i {B \choose K_i} &= {N \choose K} \\
	\sum_i \log {B \choose K_i} &\le \log {N \choose K} \\
	\sum_i b_i &\le \log {N \choose K} + \frac{N}{B}
\end{align*}
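As a small illustration with toy parameters (real blocks have $B = \frac{\log N}{2}$ bits), let $B = 4$ and assume that $b_i = \left\lceil \log {B \choose K_i} \right\rceil$.
A block with $K_i = 2$ ones is one of ${4 \choose 2} = 6$ possible blocks, so it is encoded in $b_i = \lceil \log 6 \rceil = 3$ bits instead of $4$; a block with $K_i = 0$ or $K_i = 4$ is determined uniquely and needs $b_i = 0$ bits.
Rounding up to whole bits costs less than one bit per block, which is where the additive term $\frac{N}{B}$ in the last inequality comes from.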

We also store two arrays of size $O\left(\frac{N}{B} \log B\right)$:
\begin{iteminline}
	\item $L = [K_i \forall i]$,
	\item $C = [b_i \forall i]$.
\end{iteminline}

We can now restore any block $i$ into its characteristic vector by a simple look-up table provided that we know where the representation of the block starts.
We group $\log N$ blocks into macro-blocks and store where each macro-block starts in an array $\glob$ using $O\left(\frac{N}{\log^2 N} \log N\right)$ bits.
For each macro-block we store the macro-block-local positions of the beginnings of the blocks which it contains in an array $\block_i$ using $O(\log N \log \log N)$ bits per macro-block.
This is essentially the same as the partitioning which we used for \rank{} in section \ref{ss:rank}.

\begin{algorithm}
\begin{algorithmic}
\Function{\characteristicBlock}{$S', b$} \Comment{$b$ is the block number.}
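	\State $m \gets \left\lfloor \frac{b}{\log N} \right\rfloor$ \Comment{Macro-block containing block $b$.}
	\State $p \gets \glob[m] + \block_m[b \bmod \log N]$ \Comment{Start of the block in $S'$.}
	\State \Return{$\characteristicBlock[S'[p : p + C[b]], L[b]]$}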
\EndFunction
\end{algorithmic}
\end{algorithm}

The implementation of the operations \rank{}, \select{}, and \inspect{}, which are required by the definition of FID, is straightforward.
The total size of the structure is $\log {N \choose K} + O\left(\frac{N \log \log N}{\log N}\right) = \log {N \choose K} + o(N)$.
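The parts described in this section can be checked term by term (a sketch which ignores constant factors and uses $B = \frac{\log N}{2}$; the indices for \rank{} and \select{} are not recounted here):
\begin{itemize}
	\item $S'$: $\log {N \choose K} + \frac{N}{B} = \log {N \choose K} + O\left(\frac{N}{\log N}\right)$ bits;
	\item $L$ and $C$: $O\left(\frac{N}{B} \log B\right) = O\left(\frac{N \log \log N}{\log N}\right)$ bits;
	\item $\glob$: $O\left(\frac{N}{\log^2 N} \log N\right) = O\left(\frac{N}{\log N}\right)$ bits;
	\item all arrays $\block_i$ together: $O\left(\frac{N}{\log^2 N}\right) \cdot O(\log N \log \log N) = O\left(\frac{N \log \log N}{\log N}\right)$ bits.
\end{itemize}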

\subsection{Compressed Array}\label{s:compressed-array}

In an array, a \emph{run} is a consecutive sequence of identical elements.

Let us assume that we have an array $A$ of $a$ elements of size $s$ bits each which contains $r = o(a)$ runs.
The goal will be to store $A$ in space $o(a)$.

We can \emph{compress} such array $A$ by storing:
\begin{description}
	\item[\AFID]
	A fully indexable dictionary containing the positions of the last elements of each run:
	$$ \AFID = [i : A[i + 1] \ne A[i]]. $$
	The FID contains $r$ values resulting in the space complexity $\log {a \choose r} + o(a)$.

	\item[\AElems]
	An array $A$ with each run reduced to a single (last) element:
	$$ \AElems = [A[i] : i \in \AFID]. $$
	The array contains $r$ elements of total size $rs$ bits.
	
	\item[\ABefore]
	An array containing the numbers of occurrences of the element $A[i]$ before the current run:
	$$ \ABefore = [|\{ j : j < \prev_1(\AFID, i) \booland A[j] = A[i] \}| : i \in \AFID ]. $$
	The size of the array is $r \log a$ bits.
\end{description}

We define several operations for this compressed array:
\begin{description}
	\item[\elementIndex{}]
	Returns the number of the run which contains position $i$:
	$$ \elementIndex(A, i) = \rank_1(\AFID, \succ_1(\AFID, i)). $$

	\item[\inspect{}, $A[i\char93$]
	It provides access to any element:
	$$ A[i] = \AElems[\elementIndex(A, i)]. $$

	\item[\runFirst{}, \runLast{}, \runLength{}]
	They return the positions of the first and the last element of the run containing~$i$.
	\runLength{} returns the length of the run.
	They are defined as:
	\begin{align*}
		\runFirst(A, i) &= \prev_1(\AFID, i) + 1, \\
		\runLast(A, i) &= \succ_1(\AFID, i), \\
		\runLength(A, i) &= \runLast(A, i) - \runFirst(A, i) + 1.
	\end{align*}
	
	\item[\rank{}]
	Returns the number of elements equal to $A[i]$ on positions up to and including~$i$:
	$$ \rank(A, i) = \ABefore[\elementIndex(A, i)] + (i - \runFirst(A, i) + 1). $$
	
	\item[\size{}]
	Returns the total number of elements $a$.
\end{description}
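The following example, with an array of our own choosing, traces the operations.
We assume here that $\prev_1$ returns the largest element strictly smaller than its argument ($-1$ if there is none), that $\succ_1$ returns the smallest element greater than or equal to its argument, that the last position of $A$ closes the last run, and that runs (and the entries of \AElems{} and \ABefore{}) are numbered from $1$ as returned by $\rank_1$; these conventions are assumptions of the example.
Let $A = [x, x, y, y, y, x]$, so $a = 6$ and $r = 3$. Then
$$ \AFID = \{1, 4, 5\}, \qquad \AElems = [x, y, x], \qquad \ABefore = [0, 0, 2]. $$
For $i = 5$ we get $\elementIndex(A, 5) = \rank_1(\AFID, 5) = 3$, $\runFirst(A, 5) = \prev_1(\AFID, 5) + 1 = 5$, $\runLast(A, 5) = 5$, and
$$ \rank(A, 5) = \ABefore[3] + (5 - 5 + 1) = 2 + 1 = 3, $$
which matches the three occurrences of $x$ on positions $0$, $1$ and $5$.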

\subsubsection{Extending Compressed Array}

A compressed array can be built on top of several arrays $A_0, A_1, \ldots, A_{t-1}$ with sizes $a_i$ containing $r_i$ runs.
We refer to the structure as a collection of compressed arrays.
The total space complexity follows a simple summation: $a = \sum_i a_i$ and $r = \sum_i r_i$.

We extend the structure of the compressed array by the following field:
\begin{description}
	\item[\AParts{}]
	An array of $t + 1$ positions where the arrays $A_i$ start:
	\begin{align*}
		\AParts[0] &= 0, \\
		\AParts[i] &= \sum_{j = 0}^{i - 1} |A_{j}|.
	\end{align*}
	The size of \AParts{} is $O(t \log a)$ bits; if no part is empty, then $t \le r$ and this is within the size of \ABefore{}.
\end{description}

The compressed array is then built on top of a concatenated array
$$A = A_0 \cdot A_1 \cdots A_{t-1}$$
with the exception that the first element of an array $A_i$ always starts a new run.
This makes sure that runs do not extend over several parts, which would make it harder, yet not impossible, to handle.
This is already incorporated in the total number of runs: $r = \sum_i r_i$.

Using this extra array, which has the same space complexity as \ABefore{}, we can support all operations on a part $p \in [0, t)$; they differ only by an initial offsetting of $i$:
$$i' = \AParts[p] + i.$$
Note that we can also answer \size{} of each part as $\AParts[p + 1] - \AParts[p]$ and check the bounds before an operation is performed.

\bigbreak

We also define how to turn a partially filled array $A'$ with a characteristic vector $C$ of its filled positions (which is usually obvious from the context and the definition of $A'$) into a compressed array $A$ for a given operation \op{}, which is one of \pred{}, \succ{}, \prev{}, \nextt{}.
The missing elements in the array are defined as:
$$A[i] = A'[\op(C, i)].$$
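For example (with values of our own choosing, and assuming that \pred{} applied to $C$ and $i$ returns the largest position $j \le i$ with $C[j] = 1$), let $A' = [x, \cdot, \cdot, y, \cdot]$ where the dots mark missing elements, so $C = 10010$.
If \op{} is \pred{}, the missing elements copy the closest filled element to the left:
$$ A = [x, x, x, y, y], $$
which has two runs and is therefore stored as a compressed array with $r = 2$.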

\subsection{Tiny Compressed Array}

The compressed array, as we defined it using the FID, is going to be sufficient for most cases.
However, it is impractical in situations when the number of runs is very small, since the size of the FID depends polynomially on the size of the universe $a$ rather than on the number of elements $r$ of the represented set.
We are interested in developing a more space-efficient structure for $r = O(\log a)$.

The polynomial dependency of an FID on the size of the universe $a$ is not only a problem of our rather simple data structure.
In fact, there is no known FID which does not suffer from this problem.
The two state of the art data structures \cite{patrascu2008succincter, grossi2009more} have a similar polynomial dependency.

We overcome the problem by replacing the FID by two more advanced data structures which together provide the same result under this restricted $r$.
The resulting space complexity will be $O(\log^2 a)$ bits.

\bigbreak

The first structure which we need is a \emph{fusion tree} which was described by \cite{fredman1993surpassing} and proven optimal for our case by \cite{puatracscu2006time}.
We state their result as a lemma which we do not prove.
\begin{lemma}
	A fusion tree stores $n$ $l$-bit integers and supports \pred{} and \succ{} operations in time $O(\log_l n)$ requiring space $O(n l)$ bits.
\end{lemma}

The compressed array requires only operations \prev$_1$, \succ$_1$ and \rank$_1$; the fusion tree provides the first two.
Applied to our problem of a set containing $O(\log a)$ values, it uses $O(\log^2 a)$ bits and answers the operations in time $O(1)$.
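These bounds follow by substituting $n = O(\log a)$ and $l = \log a$ into the lemma (taking $l$ as the number of bits of a position in $[0, a)$ is our reading of the setting):
$$ O(n l) = O(\log a \cdot \log a) = O(\log^2 a), \qquad O(\log_l n) = O\left(\frac{\log \log a}{\log \log a}\right) = O(1). $$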

In order to support the \rank$_1$, we use the indexable dictionary from lemma \ref{l:succint-indexable-dictionary}.
Although its \rank{} is not universal (it is restricted to elements of the dictionary), in combination with the \pred{} operation of the fusion tree the general \rank$_1$ can still be supported.
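One way to write this combination, as a sketch with the convention that $\pred(A, i)$ is the largest element of $A$ not exceeding $i$:
$$ \rank_1(A, i) = \begin{cases}
	\rank(A, \pred(A, i)) & \textif \pred(A, i) \text{ exists}; \\
	0 & \textotherwise.
\end{cases} $$
The argument of \rank{} on the right-hand side is always an element of the dictionary, so the restricted \rank{} suffices.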

\bigbreak

It is worth noting that the tiny compressed array is a theoretical data structure as it can be replaced by a simple sorted array.
The operations \rank{}, \select{}, \pred{}, and \succ{} can be then processed in time $O(\log \log a)$, which for all practical purposes ($\log \log a \le 6$) is negligible.