Skip to content

Commit

Permalink
Integrate query IDs with the hint table
Browse files Browse the repository at this point in the history
The hint table is reworked so as the normalized query string is replaced
by the query ID generated by PostgreSQL, based on the jumbling done by
the backend.  This has multiple advantages:
- Less bloat in the hint table, as query strings are replaced by bigint
to track the query ID associated with a hint.
- Less work for the module itself.  Query normalization consumes cycles
to apply the constants to the string for the match in the hint table.
The longer the string, the longer it takes to apply the normalization.
This code was copy-pasted from pg_stat_statements, and there is no need
for it anymore.  This also means less long-term technical debt to keep
this code in line with upstream.
- Less error-prone, as a type in the data inserted in the hint table
could easily cause a mismatch fail when looking for an entry.

Regression tests are adjusted to cope with this commit, where a PL/PgSQL
function is added, to be able to retrieve the query ID from a query
string, wrapped around EXPLAIN (VERBOSE, JSON FORMAT).  The query IDs
can vary across platforms, so these are hidden to keep the tests
portable.  The following tweaks are done to the regression tests:
- pg_hint_plan.sql has been switched to use this wrapper, with no
changes in the regression test coverage.
- ut-A, similarly, switches to use the function to retrieve query IDs.
A use-case was becoming useless, where we checked for two hints set for
the same query ID computed.
- ut-T included two tests with CTAS and DECLARE, which interact badly
with the PL wrapper.  These could be covered with pg_stat_statements,
but this dependency is planned to be removed entirely, and they covered
edge cases with limited value at the end.

The documentation is updated, and the code in charge of the query
normalization that was inherited from pg_stat_statements is removed,
shaving a good chunk of code.  On version upgrade, the past table is
removed, replaced by the new one.  Using the hint table requires an
upgrade up to 1.7.0.

Per pull request #193.
  • Loading branch information
michaelpq committed May 20, 2024
1 parent d292747 commit 32ced2e
Show file tree
Hide file tree
Showing 17 changed files with 171 additions and 529 deletions.
5 changes: 2 additions & 3 deletions COPYRIGHT.postgresql
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
core.c and make_join_rel.c and pg_stat_statements.c are the parts of
PostgreSQL Database Management System.
(formerly known as Postgres, then as Postgres95)
core.c and make_join_rel.c are the parts of PostgreSQL Database Management
System (formerly known as Postgres, then as Postgres95).
Copyright holders of those files are following organizations:

Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ TARSOURCES = Makefile *.c *.h COPYRIGHT* \
rpms: rpm17

# pg_hint_plan.c includes core.c and make_join_rel.c
pg_hint_plan.o: core.c make_join_rel.c # pg_stat_statements.c
pg_hint_plan.o: core.c make_join_rel.c

$(STARBALLS): $(TARSOURCES)
if [ -h $(subst .tar.gz,,$@) ]; then rm $(subst .tar.gz,,$@); fi
Expand Down
20 changes: 14 additions & 6 deletions docs/hint_table.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,23 @@ table named `"hint_plan.hints"`. The table consists of the following columns:
| column | description |
|:-------|:------------|
| `id` | Unique number to identify a row for a hint. <br>This column is filled automatically by sequence. |
| `norm_query_string` | A pattern matching with the query to be hinted. <br>Constants in the query are replaced by '?' as in the following example. |
| `query_id` | A unique query ID, generated by the backend when the GUC compute_query_id is enabled |
| `application_name` | The value of `application_name` where sessions can apply a hint. <br>The hint in the example below applies to sessions connected from psql. <br>An empty string implies that all sessions will apply the hint. |
| `hints` | Hint phrase. <br>This must be a series of hints excluding surrounding comment marks. |

The following example shows how to operate with the hint table.

```sql
=# INSERT INTO hint_plan.hints(norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM t1 WHERE t1.id = ?;',
'',
'SeqScan(t1)');
=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM t1 WHERE t1.id = 1;
QUERY PLAN
----------------------------------------
Seq Scan on public.t1
Output: id, id2
Filter: (t1.id = 1)
Query Identifier: -7164653396197960701
(4 rows)
=# INSERT INTO hint_plan.hints(query_id, application_name, hints)
VALUES (-7164653396197960701, '', 'SeqScan(t1)');
INSERT 0 1
=# UPDATE hint_plan.hints
SET hints = 'IndexScan(t1)'
Expand All @@ -32,6 +37,9 @@ The hint table is owned by the extension owner and has the same default
privileges as of the time of its creation, during `CREATE EXTENSION`.
Hints in the hint table are prioritized over hints in comments.

The query ID can be retrieved with `pg_stat_statements` or with
`EXPLAIN (VERBOSE)`.

## Types of hints

Hinting phrases are classified in multiple types based on what kind of object
Expand Down
21 changes: 21 additions & 0 deletions expected/init.out
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,27 @@ CREATE VIEW v1 AS SELECT id, val FROM t1;
CREATE VIEW v2 AS SELECT t1.id t1_id, t1.val t1_val, t2.id t2_id, t2.val t2_val FROM t1, t2 WHERE t1.id = t2.id;
CREATE VIEW v3 AS SELECT t_1.id t1_id, t_1.val t1_val, t_2.id t2_id, t_2.val t2_val FROM t1 t_1, t2 t_2 WHERE t_1.id = t_2.id;
CREATE VIEW v4 AS SELECT v_2.t1_id, t_3.id FROM v2 v_2, t3 t_3 WHERE v_2.t1_id = t_3.id;
/*
* Utility function to retrieve a query ID from a query.
*
* This wraps the input query within an EXPLAIN (VERBOSE, FORMAT json) and
* returns its query ID. This removes the need to use pg_stat_statements
* while minimizing the number of queries executed.
*/
CREATE FUNCTION get_query_id(text) RETURNS bigint
LANGUAGE plpgsql AS
$$
DECLARE
query text;
explain_output text;
query_id bigint;
BEGIN
query = 'EXPLAIN (VERBOSE, FORMAT json) ' || $1;
EXECUTE query INTO explain_output;
SELECT INTO query_id ((explain_output::jsonb)->0->'Query Identifier')::bigint;
return query_id;
END;
$$;
/*
* The following GUC parameters need the setting of the default value to
* succeed in regression test.
Expand Down
12 changes: 12 additions & 0 deletions expected/oldextversions.out
Original file line number Diff line number Diff line change
Expand Up @@ -188,4 +188,16 @@ Objects in extension "pg_hint_plan"
type hint_plan.hints[]
(4 rows)

\d hint_plan.hints
Table "hint_plan.hints"
Column | Type | Collation | Nullable | Default
------------------+---------+-----------+----------+----------------------------------
id | integer | | not null | generated by default as identity
query_id | bigint | | not null |
application_name | text | | not null |
hints | text | | not null |
Indexes:
"hints_pkey" PRIMARY KEY, btree (id)
"hints_id_and_app" UNIQUE, btree (query_id, application_name)

DROP EXTENSION pg_hint_plan;
19 changes: 7 additions & 12 deletions expected/pg_hint_plan.out
Original file line number Diff line number Diff line change
Expand Up @@ -8036,18 +8036,13 @@ error hint:
Filter: (val = 1)
(19 rows)

-- search from hint table
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints) VALUES ('EXPLAIN (COSTS false) SELECT * FROM t1 WHERE t1.id = ?;', '', 'SeqScan(t1)');
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints) VALUES ('EXPLAIN (COSTS false) SELECT id FROM t1 WHERE t1.id = ?;', '', 'IndexScan(t1)');
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints) VALUES ('EXPLAIN SELECT * FROM t1 WHERE t1.id = ?;', '', 'BitmapScan(t1)');
SELECT * FROM hint_plan.hints ORDER BY id;
id | norm_query_string | application_name | hints
----+----------------------------------------------------------+------------------+----------------
1 | EXPLAIN (COSTS false) SELECT * FROM t1 WHERE t1.id = ?; | | SeqScan(t1)
2 | EXPLAIN (COSTS false) SELECT id FROM t1 WHERE t1.id = ?; | | IndexScan(t1)
3 | EXPLAIN SELECT * FROM t1 WHERE t1.id = ?; | | BitmapScan(t1)
(3 rows)

-- Search from hint table
SELECT get_query_id('SELECT * FROM t1 WHERE t1.id = 1') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
SELECT get_query_id('SELECT id FROM t1 WHERE t1.id = 1') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'IndexScan(t1)');
SET pg_hint_plan.enable_hint_table = on;
EXPLAIN (COSTS false) SELECT * FROM t1 WHERE t1.id = 1;
LOG: pg_hint_plan:
Expand Down
90 changes: 28 additions & 62 deletions expected/ut-A.out
Original file line number Diff line number Diff line change
Expand Up @@ -149,16 +149,16 @@ error hint:
SET pg_hint_plan.enable_hint_table TO on;
-- No. A-6-1-1
\d hint_plan.hints
Table "hint_plan.hints"
Column | Type | Collation | Nullable | Default
-------------------+---------+-----------+----------+---------------------------------------------
id | integer | | not null | nextval('hint_plan.hints_id_seq'::regclass)
norm_query_string | text | | not null |
application_name | text | | not null |
hints | text | | not null |
Table "hint_plan.hints"
Column | Type | Collation | Nullable | Default
------------------+---------+-----------+----------+----------------------------------
id | integer | | not null | generated by default as identity
query_id | bigint | | not null |
application_name | text | | not null |
hints | text | | not null |
Indexes:
"hints_pkey" PRIMARY KEY, btree (id)
"hints_norm_and_app" UNIQUE, btree (norm_query_string, application_name)
"hints_id_and_app" UNIQUE, btree (query_id, application_name)

----
---- No. A-6-2 search condition
Expand All @@ -171,11 +171,9 @@ EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
(2 rows)

-- No. A-6-2-1
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ?;',
'',
'SeqScan(t1)');
SELECT get_query_id('SELECT * FROM s1.t1 WHERE t1.c1 = 1;') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
LOG: pg_hint_plan:
used hint:
Expand All @@ -191,11 +189,8 @@ error hint:
(2 rows)

-- No. A-6-2-2
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ?;',
'psql',
'BitmapScan(t1)');
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', 'dummy_application_name', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
LOG: pg_hint_plan:
used hint:
Expand All @@ -212,27 +207,9 @@ error hint:

TRUNCATE hint_plan.hints;
-- No. A-6-2-3
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ?;',
'dummy_application_name',
'SeqScan(t1)'
);
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
QUERY PLAN
------------------------------
Index Scan using t1_i1 on t1
Index Cond: (c1 = 1)
(2 rows)

TRUNCATE hint_plan.hints;
-- No. A-6-2-4
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1;',
'',
'SeqScan(t1)'
);
SELECT get_query_id('SELECT * FROM s1.t1;') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
QUERY PLAN
------------------------------
Expand All @@ -245,12 +222,9 @@ TRUNCATE hint_plan.hints;
---- No. A-6-3 number of constant
----
-- No. A-6-3-1
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT c1 FROM s1.t1;',
'',
'SeqScan(t1)'
);
SELECT get_query_id('SELECT c1 FROM s1.t1;') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT c1 FROM s1.t1;
LOG: pg_hint_plan:
used hint:
Expand All @@ -266,12 +240,9 @@ error hint:

TRUNCATE hint_plan.hints;
-- No. A-6-3-2
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ?;',
'',
'SeqScan(t1)'
);
SELECT get_query_id('SELECT * FROM s1.t1 WHERE t1.c1 = 1;') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1;
LOG: pg_hint_plan:
used hint:
Expand All @@ -288,12 +259,9 @@ error hint:

TRUNCATE hint_plan.hints;
-- No. A-6-3-3
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ? OR t1.c1 = ?;',
'',
'SeqScan(t1)'
);
SELECT get_query_id('SELECT * FROM s1.t1 WHERE t1.c1 = 1 OR t1.c1 = 2') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = 1 OR t1.c1 = 0;
LOG: pg_hint_plan:
used hint:
Expand Down Expand Up @@ -2116,11 +2084,9 @@ SHOW pg_hint_plan.parse_messages;
----
---- No. A-8-5 original GUC parameter pg_hint_plan.enable_hint_table
----
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
'EXPLAIN (COSTS false) SELECT * FROM s1.t1 WHERE t1.c1 = ?;',
'',
'SeqScan(t1)');
SELECT get_query_id('SELECT * FROM s1.t1 WHERE t1.c1 = 1;') AS query_id \gset
INSERT INTO hint_plan.hints (query_id, application_name, hints)
VALUES (:'query_id', '', 'SeqScan(t1)');
-- No. A-8-5-1
SET pg_hint_plan.enable_hint_table TO on;
SHOW pg_hint_plan.enable_hint_table;
Expand Down
58 changes: 4 additions & 54 deletions expected/ut-T.out
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,10 @@ SET pg_hint_plan.enable_hint TO on;
SET pg_hint_plan.debug_print TO on;
SET client_min_messages TO LOG;
SET search_path TO public;
-- test for get_query_string
INSERT INTO hint_plan.hints VALUES(DEFAULT,'EXPLAIN (COSTS false) SELECT * FROM t1 WHERE id = ?;', '', 'SeqScan(t1)');
INSERT INTO hint_plan.hints VALUES(DEFAULT,'PREPARE p1 AS SELECT * FROM t1 WHERE id = ?;', '', 'SeqScan(t1)');
INSERT INTO hint_plan.hints VALUES(DEFAULT,'EXPLAIN (COSTS false) DECLARE c1 CURSOR FOR SELECT * FROM t1 WHERE id = ?;', '', 'SeqScan(t1)');
INSERT INTO hint_plan.hints VALUES(DEFAULT,'EXPLAIN (COSTS false) CREATE TABLE ct1 AS SELECT * FROM t1 WHERE id = ?;', '', 'SeqScan(t1)');
-- This hint affects queries with an equivalent query ID when executed as
-- a subquery.
SELECT get_query_id('SELECT * FROM t1 WHERE id = 1;') AS query_id \gset
INSERT INTO hint_plan.hints VALUES (DEFAULT, :'query_id', '', 'SeqScan(t1)');
PREPARE p1 AS SELECT * FROM t1 WHERE id = 100;
-- These queries uses IndexScan without hints
SET pg_hint_plan.enable_hint_table to off;
Expand All @@ -20,20 +19,6 @@ EXPLAIN (COSTS false) SELECT * FROM t1 WHERE id = 100;
Index Cond: (id = 100)
(2 rows)

EXPLAIN (COSTS false) DECLARE c1 CURSOR FOR SELECT * FROM t1 WHERE id = 100;
QUERY PLAN
--------------------------------
Index Scan using t1_pkey on t1
Index Cond: (id = 100)
(2 rows)

EXPLAIN (COSTS false) CREATE TABLE ct1 AS SELECT * FROM t1 WHERE id = 100;
QUERY PLAN
--------------------------------
Index Scan using t1_pkey on t1
Index Cond: (id = 100)
(2 rows)

EXPLAIN (COSTS false) EXECUTE p1;
QUERY PLAN
--------------------------------
Expand Down Expand Up @@ -68,34 +53,6 @@ error hint:
Filter: (id = 100)
(2 rows)

EXPLAIN (COSTS false) DECLARE c1 CURSOR FOR SELECT * FROM t1 WHERE id = 100;
LOG: pg_hint_plan:
used hint:
SeqScan(t1)
not used hint:
duplication hint:
error hint:

QUERY PLAN
----------------------
Seq Scan on t1
Filter: (id = 100)
(2 rows)

EXPLAIN (COSTS false) CREATE TABLE ct1 AS SELECT * FROM t1 WHERE id = 100;
LOG: pg_hint_plan:
used hint:
SeqScan(t1)
not used hint:
duplication hint:
error hint:

QUERY PLAN
----------------------
Seq Scan on t1
Filter: (id = 100)
(2 rows)

EXPLAIN (COSTS false) EXECUTE p1;
LOG: pg_hint_plan:
used hint:
Expand Down Expand Up @@ -127,12 +84,5 @@ error hint:
(2 rows)

DEALLOCATE p1;
-- Check proper calling to generate_normalized_query
\;\;SELECT 1,2;
?column? | ?column?
----------+----------
1 | 2
(1 row)

SET pg_hint_plan.enable_hint_table to off;
DELETE FROM hint_plan.hints;
17 changes: 0 additions & 17 deletions normalize_query.h

This file was deleted.

Loading

0 comments on commit 32ced2e

Please sign in to comment.