Commit fea8b05

amotl and matriv committed
Indexing and storage: Copy editing
- Mention the fast sorting and aggregations that the doc values mechanism provides.
- Improve "doc values" section.
- Trim "introduction" section.
- Remove "summary" section.
- Fix tables.
- Various copy editing. Wording. Naming things.
- Wrap images into white background for accompanying dark mode.
- Implement suggestions from code review, also by CodeRabbit

Co-authored-by: Marios Trivyzas <[email protected]>
1 parent fb78a63 commit fea8b05

1 file changed

+85
-71
lines changed

docs/feature/storage/indexing-and-storage.md

Lines changed: 85 additions & 71 deletions
@@ -17,40 +17,28 @@ class-container: sd-p-2 sd-outline-muted sd-rounded-1
1717

1818
## Introduction
1919

20-
In this article series, we look at CrateDB from different perspectives. We start
21-
from the bottom of CrateDB architecture and gradually move up to higher layers,
22-
presenting the most important aspects of CrateDB internals. The motivation is to
23-
better understand CrateDB, as well as to aid users in maximizing the
24-
effectiveness of CrateDB features.
25-
26-
In the first part, we explore the internal workings of the storage layer in
27-
CrateDB. The storage layer ensures that data is stored in a safe and accurate
28-
way and returned completely and efficiently. The CrateDB storage layer is based
29-
on Lucene indexes. Lucene offers scalable and high-performance indexing which
30-
enables efficient search and aggregations over documents and rapid updates to
31-
the existing documents. We will look at the three main Lucene structures that
32-
are used within CrateDB: Inverted Indexes for text values, BKD-Trees for numeric
33-
values, and Doc Values.
20+
This article explores the internal workings of the storage layer in CrateDB.
21+
The storage layer ensures that data is stored in a safe and accurate
22+
way and returned completely and efficiently.
23+
CrateDB's storage layer is based on Lucene indexes.
3424

3525
## What's inside
3626

37-
This article explores the internal workings of the storage layer in CrateDB,
38-
with a focus on Lucene's indexing strategies.
27+
Lucene offers scalable and high-performance indexing, which enables efficient
28+
search and aggregations over documents and rapid updates to the existing
29+
documents. We will look at the three main Lucene structures that are used
30+
within CrateDB: Inverted indexes for text values, BKD trees for numeric
31+
values, and doc values.
3932

40-
The CrateDB storage layer is based on Lucene indexes. Lucene offers scalable and
41-
high-performance indexing which enables efficient search and aggregations over
42-
documents and rapid updates to the existing documents. We will look at the three
43-
main Lucene structures that are used within CrateDB: Inverted Indexes for text
44-
values, BKD-Trees for numeric values, and Doc Values.
33+
:Inverted index: Understand how inverted indexes are implemented in Lucene
34+
and how CrateDB uses them to index text values and enable fast text searches.
4535

46-
:Inverted Index: You will learn how inverted indexes are implemented in Lucene
47-
and CrateDB, and how they are used for indexing text values.
48-
49-
:BKD Tree: Better understand the BKD tree, starting from KD trees, and how this
36+
:BKD tree: Understand the BKD tree, starting from KD trees, and how this
5037
data structure supports range queries on numeric values in CrateDB.
5138

52-
:Doc Values: This data structure supports more efficient querying document
53-
fields by id, performs column-oriented retrieval of data, and improves the
39+
:Doc values: This data structure
40+
enables efficient queries by document field name,
41+
performs column-oriented retrieval of data, and improves the
5442
performance of aggregation and sorting operations.
5543

5644
## Indexing text values
@@ -95,7 +83,9 @@ example:
9583
The diagram below shows the indexing terms from two documents, the sorted
9684
sequence, and finally the index.
9785

86+
:::{div} sd-bg-white
9887
![Indexing terms sorted sequence and index](https://crate.io/hubfs/Sequence-of-terms-Sorted-Sequence-Index.png)
88+
:::
9989

10090
### Lucene segments
10191

@@ -122,9 +112,8 @@ separately.
122112
To illustrate both indexing methods, let’s consider a simple table called
123113
*Product*:
124114

125-
| | | |
126-
| ------------- | ------------ | ------------ |
127115
| **productID** | **name** | **quantity** |
116+
| ------------- | ------------ | ------------ |
128117
| 1 | Almond Milk | 100 |
129118
| 2 | Almond Flour | 200 |
130119
| 3 | Milk | 300 |
@@ -166,10 +155,12 @@ Lucene 6.0 adds an implementation of Block KD (BKD) tree data structure.
166155

167156
### BKD tree
168157

169-
To better understand the BKD tree data structure, let’s start with a short
158+
To better understand the BKD tree data structure, let's begin with an
170159
introduction to KD trees. A KD tree is a binary tree for multidimensional
171160
queries. KD tree shares the same properties as binary search trees (BST), but
172-
the dimensions alternate for each level of the tree. For instance, starting from
161+
the dimensions alternate for each level of the tree.
162+
163+
For instance, starting from
173164
the root node, the x value of the left nodes is always less than the x value of
174165
the root node. The same applies to the right node and all intermediate nodes up
175166
to leaf nodes. KDB tree is a special kind of KD tree with properties found in
@@ -211,7 +202,9 @@ construction process is as follows:
211202
construction process stops. Finally, the KDB tree is constructed as
212203
illustrated in the figure below:
213204

205+
:::{div} sd-bg-white
214206
![KDB tree divided by y dimension](https://crate.io/hubfs/divded-by-y-dimension.png)
207+
:::
215208

216209
The index file with the resulting data structure is then created as a series of
217210
blocks that contain data from leaf nodes, intermediate nodes, and the metadata
@@ -220,7 +213,7 @@ of this article.
220213

221214
### Range queries
222215

223-
Numerical indexing relies on BKD-Tree to accelerate the performance of range
216+
Numerical indexing relies on BKD tree to accelerate the performance of range
224217
queries. Considering our KDB tree, to query all points in the range x in [1,8]
225218
and y in [9,11], the engine does the following:
226219

@@ -233,67 +226,88 @@ and y in [9,11], the engine does the following:
233226
- All child nodes of the right subtree satisfy our query range and zero child
234227
nodes from the left subtree. Finally, the query output is: {7,11} and {8,9}.
235228

236-
## Doc Values
229+
## Fast sorting and aggregations
237230

238-
Until Lucene 4.0 columns were indexed using an inverted index data structure
239-
that maps terms to document ids. For searching documents by terms, this is a
240-
very good solution. However, if we have to find field values given document id,
241-
this solution was not equally effective. Furthermore, to perform column-oriented
231+
### Document fields
232+
233+
Before Lucene 4.0, inverted indexes efficiently mapped terms to document ids
234+
but struggled with reverse lookups (document id → field value) and
235+
column-oriented retrieval. Doc values, introduced in Lucene 4.0, address
236+
this by storing field values in a column-stride format at index time,
237+
optimizing aggregations, sorting, and field access.
238+
239+
Lucene's stored document
240+
fields store all field values for one document together in a
241+
row-stride fashion, and are therefore relatively slow to access.
242+
243+
To perform column-oriented
242244
retrieval of data, it was necessary to traverse and extract all fields that
243245
appear in the collection of documents. This can cause memory and performance
244-
issues if we need to extract a large amount of data.
246+
issues when extracting a large amount of data from an inverted index.
247+
248+
### Doc values
249+
250+
Doc values store data column-stride (per field), unlike stored fields which
251+
are row-stride (per document), enabling faster field-specific access,
252+
and provide fast sorting and aggregations.
245253

246-
To improve the performance of aggregations and sorting, a new data structure was
247-
introduced, namely Doc Values. Doc Values is a column-based data storage built
248-
at document index time. They store all field values that are not analyzed as
249-
strings in a compact column making it more effective for sorting and
250-
aggregations.
254+
Doc values is a column-based data storage built at document index time.
255+
They store all field values that are not analyzed as strings in a compact
256+
column, making it more effective for sorting and aggregations.
251257

252-
CrateDB implements Column Store based on Doc Values in Lucene. The Column Store
253-
is created for each field in a document and generated as the following
254-
structures for fields in the Product table:
258+
Because Lucene’s inverted index data structure implementation is not
259+
optimal for finding field values by given document identifier, and for
260+
performing column-oriented retrieval of data, the doc values data
261+
structure is used for those purposes instead.
262+
263+
Doc values allow storing numerics and timestamps (single-valued or
264+
arrays), keywords (single-valued or arrays) and binary data per row.
265+
These values are quite fast to access at search time, since they are
266+
stored column-stride such that only the value for that one field needs
267+
to be decoded per row searched.
268+
269+
:::{seealso}
270+
-- [Document values with Apache Lucene]
271+
:::
272+
273+
### Column store
274+
275+
CrateDB implements a {ref}`column store <crate-reference:ddl-storage-columnstore>`
276+
based on doc values in Lucene. Using the *Product* table example:
255277

256278
| | **Document 1** | **Document 2** | **Document 3** |
257279
| --------- | -------------- | -------------- | -------------- |
258280
| productID | 1 | 2 | 3 |
259281
| name | Almond Milk | Almond Flour | Milk |
260282
| quantity | 100 | 200 | 300 |
261283

284+
Each field's values are stored contiguously in a column store (e.g.,
285+
all `productID` values: 1, 2, 3), enabling efficient column-based operations.
286+
262287
For example, for the first document, CrateDB creates the following mappings as
263-
Column Store: {productID → 1, name → “Almond Milk“, quantity → 100}.
264-
265-
Column Store significantly improves aggregations and grouping as the data for
266-
one column is packed in one place. Instead of traversing each document and
267-
fetching values of the field that can also be very scattered, we extract all
268-
field data from the existing Column Store. This approach significantly improves
269-
the performance of sorting, grouping, and aggregation operations. In CrateDB,
270-
Column Store is enabled by default and can be disabled only for text fields, not
271-
for other primitive types. Furthermore, CrateDB does not support storing values
272-
for {ref}`container <container>` and {ref}`geographic <geospatial>` data types
273-
in Column Store.
274-
275-
Besides fields, CrateDB also supports Column Store for the JSON representation
276-
of each row in a table. For our example, row-based Column Store is generated as
288+
a column store: {productID → 1, name → “Almond Milk“, quantity → 100}.
289+
290+
This storage layout improves sorting, grouping, and aggregations by keeping field
291+
data together rather than scattered across documents. The column store is enabled
292+
by default in CrateDB and can be disabled only for text fields. It does not support
293+
{ref}`container <container>` or {ref}`geographic <geospatial>` data types.
294+
295+
Besides fields, CrateDB also supports the column store for the JSON representation
296+
of each row in a table. For this example, the row-based column store is generated as
277297
the following:
278298

279299
| **Document** | **Row** |
280300
| ------------ | ----------------------------------------------- |
281-
| 1 | {“id“:1, name“:”Almond Milk”, “quantity:100} |
282-
| 2 | {“id“:2, name“:”Almond Flour”, “quantity:200} |
283-
| 3 | {“id“:3, name“:”Milk”, “quantity:300} |
301+
| 1 | {"id":1, "name":"Almond Milk", "quantity":100} |
302+
| 2 | {"id":2, "name":"Almond Flour", "quantity":200} |
303+
| 3 | {"id":3, "name":"Milk", "quantity":300} |
284304

285-
The use of Column Store results in a small disk footprint, thanks to specialized
305+
The use of a column store results in a small disk footprint, thanks to specialized
286306
compression algorithms such as delta encoding, bit packing, and GCD.
287307

288-
## Summary
289-
290-
This article describes the core design principles of the storage layer in
291-
CrateDB. Being based on the Lucene index, it enables effective and efficient
292-
search over the arbitrary size of documents with an arbitrary number of fields.
293-
294308
Besides inverted indexes, the Lucene indexing strategy also relies on BKD trees
295309
and Doc Values that are successfully adopted by CrateDB as well as many popular
296310
search engines. With a better understanding of the storage layer, we move to
297311
another interesting topic: [Handling Dynamic Objects in CrateDB].
298312

299-
[Handling Dynamic Objects in CrateDB]: https://cratedb.com/blog/handling-dynamic-objects-in-cratedb
313+
[Document values with Apache Lucene]: https://www.elastic.co/blog/sparse-versus-dense-document-values-with-apache-lucene
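
The row-stride versus column-stride distinction that the revised "Doc values" and "Column store" sections describe can be sketched in a few lines of Python. This is only an illustration using the *Product* table from the diff: the `rows` list and the `to_columns` helper are hypothetical names, and plain Python lists stand in for Lucene's actual on-disk encoding, which is not shown here.

```python
# Illustrative sketch only: plain Python structures, not Lucene's or
# CrateDB's real storage format. All names here are hypothetical.

# Row-stride layout: all field values of one document are kept together,
# as with Lucene's stored document fields.
rows = [
    {"productID": 1, "name": "Almond Milk", "quantity": 100},
    {"productID": 2, "name": "Almond Flour", "quantity": 200},
    {"productID": 3, "name": "Milk", "quantity": 300},
]


def to_columns(records):
    """Pivot row-stride records into a column-stride layout:
    one contiguous list of values per field, indexed by document position."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregation only touches the single column it needs.
total_quantity = sum(columns["quantity"])

# Sorting yields document positions ordered by one field's values,
# again without decoding any other field.
order_by_name = sorted(range(len(rows)), key=lambda doc: columns["name"][doc])

print(total_quantity)
print(order_by_name)
```

With the row-stride `rows` list, the same aggregation would have to visit every document and pick one field out of each record; the column-stride `columns` mapping keeps each field's values together, which is the property the diff credits for faster sorting, grouping, and aggregations.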
