Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
validate-index: Implement a function to validate index data structures (#208)

* validate-index: Implement a function to validate index data structures

Example:

```
CREATE EXTENSION lantern;
CREATE TABLE small_world (
    id SERIAL PRIMARY KEY,
    v REAL[2]
);
INSERT INTO small_world (v) VALUES ('{0,0,1}'), ('{0,1,0}');
CREATE INDEX ON small_world USING hnsw (v);
SELECT _lantern_internal.validate_index('small_world_v_idx');
```

The output of the last command:

```
INFO: validate_index() start for small_world_v_idx
INFO: index_header = HnswIndexHeaderPage(version=1 vector_dim=3 m=16 ef_construction=128 ef=64 metric_kind=1 num_vectors=2 last_data_block=2 blockmap_page_groups=0)
INFO: blocks_nr=3 nodes_nr=2
INFO: blocks for: header 1 blockmap 1 nodes 1
INFO: nodes per block: last block 2
INFO: level=0: nodes 2 directed neighbor edges 2 min neighbors 1 max neighbors 1
INFO: validate_index() done, no issues found.
 validate_index
----------------

(1 row)
```

To see the indexes that could be passed to the function:

```
postgres=# \d small_world;
                             Table "public.small_world"
 Column |  Type   | Collation | Nullable |                 Default
--------+---------+-----------+----------+-----------------------------------------
 id     | integer |           | not null | nextval('small_world_id_seq'::regclass)
 v      | real[]  |           |          |
Indexes:
    "small_world_pkey" PRIMARY KEY, btree (id)
    "small_world_v_idx" hnsw (v)
```

This patch also adds the validate_index() call to existing tests.

Because hnsw_generate_new_level() uses an RNG, the number of levels assigned to newly INSERTed nodes is not deterministic, and the validate_index() output may change between runs, because it prints the number of nodes for each level. If you see sporadic test failures due to different validate_index() info output, please remove the validate_index() call from the test. Another solution would be to add an option to validate_index() that controls whether the additional info is printed with elog().

* src/hnsw/validate_index: run clang-format

* src/hnsw/validate_index: use unsigned batch_size and group_node_first_index

They are compared with and used in the same expressions as other unsigned variables anyway. There is no good reason for them to be signed.

* src/hnsw/validate_index: change PRIu64 to ul

Reference: https://gitlab.com/wireshark/wireshark/-/issues/17895

* src/hnsw/validate_index: remove dangling " " after clang-format

* src/hnsw/validate_index: include access/heapam.h instead of access/relation.h for PostgreSQL 11

* src/hnsw/validate_index: clang-format

* src/hnsw/validate_index: make elog(INFO, ...) prints optional and enabled by default

This is required because some tests build the HNSW index in a non-deterministic way.

* test: make validate_index() output deterministic

* src/hnsw/validate_index: use ldb_invariant() instead of assert()

* src/hnsw/validate_index: reduce the scope of what's done in LDB_VI_READ_NODE_CHUNK() macro

* src/hnsw/validate_index: validate vn_dim properly

* src/hnsw/validate_index: add a comment about assumptions and storage format for struct ldb_vi_node

* src/hnsw/validate_index: describe what vi here is

* src/hnsw/validate_index: cast ldb_HnswGetM() to uint32 to compare with HnswIndexHeaderPage.m

* use FirstOffsetNumber and OffsetNumberNext() in the loop over page
Showing 26 changed files with 1,003 additions and 7 deletions.
```
@@ -0,0 +1,25 @@
+#ifndef LDB_HNSW_VALIDATE_INDEX_H
+#define LDB_HNSW_VALIDATE_INDEX_H
+
+#include <postgres.h>
+
+/*
+ * This function checks integrity of the data structures in the index relation.
+ *
+ * How it works:
+ * - it creates ldb_vi_block for each block of the index relation and
+ *   ldb_vi_node for each node inside the index relation;
+ * - it loads all blockmap groups and analyzes mappings between nodes and
+ *   blocks;
+ * - it loads all the nodes with their neighbors;
+ * - it also prints statistics about blocks and nodes, which is useful for
+ *   understanding of what's inside the index;
+ * - it assumes that PostgreSQL-level data structures are intact (i.e. the page
+ *   header and the mapping between offsets and items is correct for each page);
+ * - in case if a corruption of the data structure is found the function prints
+ *   an error message with details about the place and surrounding data
+ *   structures.
+ */
+void ldb_validate_index(Oid indrelid, bool print_info);
+
+#endif
```
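The steps listed in the header comment can be illustrated with a small standalone sketch. This is not the code from this commit: the `sketch_*` types and functions below are hypothetical stand-ins for `ldb_vi_block` and `ldb_vi_node`, they work on plain in-memory arrays rather than PostgreSQL pages, and the particular checks (block range, block type, neighbor id range) are assumptions about what a validator of this kind would plausibly verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block kinds; the real index has header, blockmap and node blocks. */
enum sketch_block_type { SKETCH_BLOCK_HEADER, SKETCH_BLOCK_BLOCKMAP, SKETCH_BLOCK_NODES };

/* Hypothetical stand-in for ldb_vi_block (not the real layout). */
struct sketch_block {
    enum sketch_block_type type;
    uint32_t               nodes_nr; /* nodes the blockmap places in this block */
};

/* Hypothetical stand-in for ldb_vi_node (not the real layout). */
struct sketch_node {
    uint32_t        block;        /* block number taken from the blockmap */
    uint32_t        neighbors_nr; /* number of level-0 neighbors */
    const uint32_t *neighbors;    /* neighbor node ids */
};

/* Per-level statistics, mirroring the "level=0: ..." INFO line. */
struct sketch_level_stats {
    uint32_t nodes;
    uint64_t edges;
    uint32_t min_neighbors;
    uint32_t max_neighbors;
};

/*
 * Checks node -> block mappings and neighbor ids, and counts nodes per block
 * as a side effect. Returns NULL if everything is consistent, otherwise a
 * static description of the first problem found.
 */
const char *sketch_validate(struct sketch_block *blocks, uint32_t blocks_nr,
                            const struct sketch_node *nodes, uint32_t nodes_nr)
{
    assert(blocks != NULL || blocks_nr == 0);
    assert(nodes != NULL || nodes_nr == 0);

    for (uint32_t i = 0; i < nodes_nr; ++i) {
        if (nodes[i].block >= blocks_nr)
            return "blockmap entry points past the end of the relation";
        if (blocks[nodes[i].block].type != SKETCH_BLOCK_NODES)
            return "blockmap entry points to a non-node block";
        blocks[nodes[i].block].nodes_nr++;

        for (uint32_t j = 0; j < nodes[i].neighbors_nr; ++j) {
            if (nodes[i].neighbors[j] >= nodes_nr)
                return "neighbor id is out of range";
        }
    }
    return NULL;
}

/* Counts directed edges and the min/max neighbor count over all nodes. */
struct sketch_level_stats sketch_level0_stats(const struct sketch_node *nodes,
                                              uint32_t nodes_nr)
{
    struct sketch_level_stats s = { nodes_nr, 0, nodes_nr ? UINT32_MAX : 0, 0 };

    for (uint32_t i = 0; i < nodes_nr; ++i) {
        uint32_t n = nodes[i].neighbors_nr;

        s.edges += n;
        if (n < s.min_neighbors)
            s.min_neighbors = n;
        if (n > s.max_neighbors)
            s.max_neighbors = n;
    }
    return s;
}
```

For the two-row `small_world` example from the commit message, the corresponding input would be three blocks (header, blockmap, nodes) and two nodes stored in the last block, each listing the other as its only neighbor. Keeping the checks separate from the statistics is a choice made for the sketch only; the commit's `ldb_validate_index()` does both and reports through PostgreSQL's logging, gated by `print_info`.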