@@ -17,40 +17,28 @@ class-container: sd-p-2 sd-outline-muted sd-rounded-1
1717
1818## Introduction
1919
20- In this article series, we look at CrateDB from different perspectives. We start
21- from the bottom of CrateDB architecture and gradually move up to higher layers,
22- presenting the most important aspects of CrateDB internals. The motivation is to
23- better understand CrateDB, as well as to aid users in maximizing the
24- effectiveness of CrateDB features.
25-
26- In the first part, we explore the internal workings of the storage layer in
27- CrateDB. The storage layer ensures that data is stored in a safe and accurate
28- way and returned completely and efficiently. The CrateDB storage layer is based
29- on Lucene indexes. Lucene offers scalable and high-performance indexing which
30- enables efficient search and aggregations over documents and rapid updates to
31- the existing documents. We will look at the three main Lucene structures that
32- are used within CrateDB: Inverted Indexes for text values, BKD-Trees for numeric
33- values, and Doc Values.
20+ This article explores the internal workings of the storage layer in CrateDB.
21+ The storage layer ensures that data is stored in a safe and accurate
22+ way and returned completely and efficiently.
23+ CrateDB's storage layer is based on Lucene indexes.
3424
3525## What's inside
3626
37- This article explores the internal workings of the storage layer in CrateDB,
38- with a focus on Lucene's indexing strategies.
27+ Lucene offers scalable and high-performance indexing, which enables efficient
28+ search and aggregations over documents and rapid updates to the existing
29+ documents. We will look at the three main Lucene structures that are used
30+ within CrateDB: Inverted indexes for text values, BKD trees for numeric
31+ values, and doc values.
3932
40- The CrateDB storage layer is based on Lucene indexes. Lucene offers scalable and
41- high-performance indexing which enables efficient search and aggregations over
42- documents and rapid updates to the existing documents. We will look at the three
43- main Lucene structures that are used within CrateDB: Inverted Indexes for text
44- values, BKD-Trees for numeric values, and Doc Values.
33+ : Inverted index: Understand how inverted indexes are implemented in Lucene
34+ and how CrateDB uses them to index text values and enable fast text searches.
4535
46- : Inverted Index: You will learn how inverted indexes are implemented in Lucene
47- and CrateDB, and how they are used for indexing text values.
48-
49- : BKD Tree: Better understand the BKD tree, starting from KD trees, and how this
36+ : BKD tree: Understand the BKD tree, starting from KD trees, and how this
5037data structure supports range queries on numeric values in CrateDB.
5138
52- : Doc Values: This data structure supports more efficient querying document
53- fields by id, performs column-oriented retrieval of data, and improves the
39+ : Doc values: This data structure
40+ enables efficient queries by document field name,
41+ performs column-oriented retrieval of data, and improves the
5442performance of aggregation and sorting operations.
5543
5644## Indexing text values
@@ -95,7 +83,9 @@ example:
9583The diagram below shows the indexing terms from two documents, the sorted
9684sequence, and finally the index.
9785
86+ :::{div} sd-bg-white
9887![Indexing terms sorted sequence and index](https://crate.io/hubfs/Sequence-of-terms-Sorted-Sequence-Index.png)
88+ :::
9989
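The three stages in the diagram (terms per document, sorted sequence, index) can be sketched in a few lines of Python. This is a minimal illustration, not Lucene's implementation: the whitespace tokenizer stands in for a real analyzer, and the two sample documents and the function names are assumptions made for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each term to the ordered list of document ids that contain it."""
    # Stage 1: emit one (term, doc_id) pair per term occurrence.
    pairs = [
        (term, doc_id)
        for doc_id, text in documents.items()
        for term in tokenize(text)
    ]
    # Stage 2: sort the sequence by term, then by document id.
    pairs.sort()
    # Stage 3: merge equal terms into a single postings list per term.
    index = defaultdict(list)
    for term, doc_id in pairs:
        postings = index[term]
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
    return dict(index)


# Hypothetical sample documents, keyed by document id.
documents = {1: "Almond Milk", 2: "Almond Flour"}
index = build_inverted_index(documents)
print(index)
```

Looking up a term is then a single dictionary access, for example `index["almond"]`, which returns the ids of all documents that contain it.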
10090### Lucene segments
10191
@@ -122,9 +112,8 @@ separately.
122112To illustrate both indexing methods, let’s consider a simple table called
123113*Product*:
124114
125- | | | |
126- | ------------- | ------------ | ------------ |
127115| **productID** | **name** | **quantity** |
116+ | ------------- | ------------ | ------------ |
128117| 1 | Almond Milk | 100 |
129118| 2 | Almond Flour | 200 |
130119| 3 | Milk | 300 |
@@ -166,10 +155,12 @@ Lucene 6.0 adds an implementation of Block KD (BKD) tree data structure.
166155
167156### BKD tree
168157
169- To better understand the BKD tree data structure, let’s start with a short
158+ To better understand the BKD tree data structure, let's begin with an
170159introduction to KD trees. A KD tree is a binary tree for multidimensional
171160queries. A KD tree shares the same properties as binary search trees (BST), but
172- the dimensions alternate for each level of the tree. For instance, starting from
161+ the dimensions alternate for each level of the tree.
162+
163+ For instance, starting from
173164the root node, the x value of the left nodes is always less than the x value of
174165the root node. The same applies to the right node and all intermediate nodes up
175166to leaf nodes. KDB tree is a special kind of KD tree with properties found in
@@ -211,7 +202,9 @@ construction process is as follows:
211202 construction process stops. Finally, the KDB tree is constructed as
212203 illustrated in the figure below:
213204
205+ :::{div} sd-bg-white
214206![KDB tree divided by y dimension](https://crate.io/hubfs/divded-by-y-dimension.png)
207+ :::
215208
216209The index file with the resulting data structure is then created as a series of
217210blocks that contain data from leaf nodes, intermediate nodes, and the metadata
@@ -220,7 +213,7 @@ of this article.
220213
221214### Range queries
222215
223- Numerical indexing relies on BKD-Tree to accelerate the performance of range
216+ Numerical indexing relies on the BKD tree to accelerate range
224217queries. Considering our KDB tree, to query all points in the range x in [1,8]
225218and y in [9,11], the engine does the following:
226219
@@ -233,67 +226,88 @@ and y in [9,11], the engine does the following:
233226- All child nodes of the right subtree satisfy our query range and zero child
234227 nodes from the left subtree. Finally, the query output is: {7,11} and {8,9}.
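
The pruning idea behind these steps can be sketched with a plain in-memory KD tree. This is a simplified illustration under stated assumptions: it is not the block-based BKD layout Lucene writes to disk, the point set is hypothetical (only the two result points come from the example above), and `build` and `range_query` are names chosen for this sketch.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    point: tuple
    left: Optional["Node"]
    right: Optional["Node"]


def build(points, depth=0):
    """Build a 2-D KD tree, alternating the split dimension per level."""
    if not points:
        return None
    axis = depth % 2  # 0 splits on x, 1 splits on y
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(
        points[mid],
        build(points[:mid], depth + 1),
        build(points[mid + 1:], depth + 1),
    )


def range_query(node, lo, hi, depth=0, out=None):
    """Collect all points p with lo[d] <= p[d] <= hi[d] in both dimensions."""
    if out is None:
        out = []
    if node is None:
        return out
    axis = depth % 2
    if all(lo[d] <= node.point[d] <= hi[d] for d in range(2)):
        out.append(node.point)
    # Visit a subtree only if the query range can overlap it.
    if lo[axis] <= node.point[axis]:
        range_query(node.left, lo, hi, depth + 1, out)
    if node.point[axis] <= hi[axis]:
        range_query(node.right, lo, hi, depth + 1, out)
    return out


# Hypothetical points; (7, 11) and (8, 9) are the results named in the text.
points = [(1, 2), (3, 8), (4, 5), (6, 3), (7, 11), (8, 9), (9, 1), (10, 6)]
tree = build(points)
result = sorted(range_query(tree, lo=(1, 9), hi=(8, 11)))
print(result)
```

Subtrees whose split value lies outside the query range are skipped entirely, which is what keeps range queries cheap compared with scanning every point.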
235228
236- ## Doc Values
229+ ## Fast sorting and aggregations
237230
238- Until Lucene 4.0 columns were indexed using an inverted index data structure
239- that maps terms to document ids. For searching documents by terms, this is a
240- very good solution. However, if we have to find field values given document id,
241- this solution was not equally effective. Furthermore, to perform column-oriented
231+ ### Document fields
232+
233+ Before Lucene 4.0, inverted indexes efficiently mapped terms to document ids
234+ but struggled with reverse lookups (document id → field value) and
235+ column-oriented retrieval. Doc values, introduced in Lucene 4.0, address
236+ this by storing field values in a column-stride format at index time,
237+ optimizing aggregations, sorting, and field access.
238+
239+ Lucene's stored document
240+ fields store all field values for one document together in a
241+ row-stride fashion, and are therefore relatively slow to access.
242+
243+ To perform column-oriented
242244retrieval of data, it was necessary to traverse and extract all fields that
243245appear in the collection of documents. This can cause memory and performance
244- issues if we need to extract a large amount of data.
246+ issues when extracting a large amount of data from an inverted index.
247+
248+ ### Doc values
249+
250+ Doc values store data column-stride (per field), unlike stored fields which
251+ are row-stride (per document), enabling faster field-specific access,
252+ and provide fast sorting and aggregations.
245253
246- To improve the performance of aggregations and sorting, a new data structure was
247- introduced, namely Doc Values. Doc Values is a column-based data storage built
248- at document index time. They store all field values that are not analyzed as
249- strings in a compact column making it more effective for sorting and
250- aggregations.
254+ Doc values are a column-based data store built at document index time.
255+ They store all field values that are not analyzed as strings in a compact
256+ column, which makes sorting and aggregations more efficient.
251257
252- CrateDB implements Column Store based on Doc Values in Lucene. The Column Store
253- is created for each field in a document and generated as the following
254- structures for fields in the Product table:
258+ Because Lucene’s inverted index data structure implementation is not
259+ optimal for finding field values by given document identifier, and for
260+ performing column-oriented retrieval of data, the doc values data
261+ structure is used for those purposes instead.
262+
263+ Doc values allow storing numerics and timestamps (single-valued or
264+ arrays), keywords (single-valued or arrays) and binary data per row.
265+ These values are quite fast to access at search time, since they are
266+ stored column-stride such that only the value for that one field needs
267+ to be decoded per row searched.
268+
269+ :::{seealso}
270+ -- [Document values with Apache Lucene]
271+ :::
272+
273+ ### Column store
274+
275+ CrateDB implements a {ref}`column store <crate-reference:ddl-storage-columnstore>`
276+ based on doc values in Lucene. Using the *Product* table example:
255277
256278| | **Document 1** | **Document 2** | **Document 3** |
257279| --------- | -------------- | -------------- | -------------- |
258280| productID | 1 | 2 | 3 |
259281| name | Almond Milk | Almond Flour | Milk |
260282| quantity | 100 | 200 | 300 |
261283
284+ Each field's values are stored contiguously in a column store (e.g.,
285+ all `productID` values: 1, 2, 3), enabling efficient column-based operations.
286+
262287For example, for the first document, CrateDB creates the following mappings as
263- Column Store: {productID → 1, name → “Almond Milk“, quantity → 100}.
264-
265- Column Store significantly improves aggregations and grouping as the data for
266- one column is packed in one place. Instead of traversing each document and
267- fetching values of the field that can also be very scattered, we extract all
268- field data from the existing Column Store. This approach significantly improves
269- the performance of sorting, grouping, and aggregation operations. In CrateDB,
270- Column Store is enabled by default and can be disabled only for text fields, not
271- for other primitive types. Furthermore, CrateDB does not support storing values
272- for {ref}` container <container> ` and {ref}` geographic <geospatial> ` data types
273- in Column Store.
274-
275- Besides fields, CrateDB also supports Column Store for the JSON representation
276- of each row in a table. For our example, row-based Column Store is generated as
288+ a column store: {productID → 1, name → "Almond Milk", quantity → 100}.
289+
290+ This storage layout improves sorting, grouping, and aggregations by keeping field
291+ data together rather than scattered across documents. The column store is enabled
292+ by default in CrateDB and can be disabled only for text fields. It does not support
293+ {ref}`container <container>` or {ref}`geographic <geospatial>` data types.
294+
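To make the difference between the two layouts concrete, the following sketch transposes the *Product* rows into one list per field and runs an aggregation and a sort against a single column. It is an in-memory illustration only; the helper name `to_columns` is an assumption of this example and says nothing about how CrateDB or Lucene lay out data on disk.

```python
# Row-stride layout: all field values of one document are kept together.
rows = [
    {"productID": 1, "name": "Almond Milk", "quantity": 100},
    {"productID": 2, "name": "Almond Flour", "quantity": 200},
    {"productID": 3, "name": "Milk", "quantity": 300},
]


def to_columns(rows):
    """Transpose rows into a column-stride layout: one list per field."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregation reads one contiguous column and never touches other fields.
total_quantity = sum(columns["quantity"])

# Sorting also needs only the sort column; the result is a list of row positions.
positions_by_quantity_desc = sorted(
    range(len(rows)), key=lambda i: columns["quantity"][i], reverse=True
)

print(columns["productID"], total_quantity, positions_by_quantity_desc)
```

With the row layout, the same `SUM(quantity)` would have to decode every document, including the `name` values it does not need.
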
295+ Besides fields, CrateDB also supports the column store for the JSON representation
296+ of each row in a table. For this example, the row-based column store is generated as
277297the following:
278298
279299| **Document** | **Row** |
280300| ------------ | ----------------------------------------------- |
281- | 1 | {“id“ :1, “ name“:” Almond Milk”, “ quantity“ :100} |
282- | 2 | {“id“ :2, “ name“:” Almond Flour”, “ quantity“ :200} |
283- | 3 | {“id“ :3, “ name“:” Milk”, “ quantity“ :300} |
301+ | 1 | {"id": 1, "name": "Almond Milk", "quantity": 100} |
302+ | 2 | {"id": 2, "name": "Almond Flour", "quantity": 200} |
303+ | 3 | {"id": 3, "name": "Milk", "quantity": 300} |
284304
285- The use of Column Store results in a small disk footprint, thanks to specialized
305+ The use of a column store results in a small disk footprint, thanks to specialized
286306compression algorithms such as delta encoding, bit packing, and GCD.
287307
288- ## Summary
289-
290- This article describes the core design principles of the storage layer in
291- CrateDB. Being based on the Lucene index, it enables effective and efficient
292- search over the arbitrary size of documents with an arbitrary number of fields.
293-
294308Besides inverted indexes, the Lucene indexing strategy also relies on BKD trees
295309and Doc Values that are successfully adopted by CrateDB as well as many popular
296310search engines. With a better understanding of the storage layer, we move to
297311another interesting topic: [Handling Dynamic Objects in CrateDB].
298312
299- [Handling Dynamic Objects in CrateDB]: https://cratedb.com/blog/handling-dynamic-objects-in-cratedb
313+ [Document values with Apache Lucene]: https://www.elastic.co/blog/sparse-versus-dense-document-values-with-apache-lucene