Commit
Add lexer to handle query parsing for hints
This commit adds a lexer to pg_hint_plan to properly parse the hints from query strings. This implementation is based on PostgreSQL's psqlscan.l, largely simplified and made pluggable for the sake of this module, as it mostly cares about detecting comments and hints in queries.

This has the advantage of removing all the custom code that pg_hint_plan had invented to parse query strings, cleaning up some confusing historical behaviors in the module:
- A qual value could be considered as a valid hint, like:
      SELECT * FROM tab WHERE col = '/* IndexScan (tab) */';
  This is now parsed as a value, as it should be.
- The parsing of comments was sometimes incorrect, failing to ignore comments that would make hints invalid. For example:
      SELECT /* IndexScan (tab /* internal_comment */ ) */ FROM tab;
  Any comments internal to the hints are now automatically discarded. Attempting to use nested comments generates INFO messages each time something fishy is found.

The contents of the hints behave mostly the same way as before this commit, with their custom parsing. An exception is nested comments, which are now simply ignored in the middle of hints, with the hints still being used if they are valid. This makes the manipulation of the StringInfo saving the hint strings more straightforward.

A couple of regression tests still need to be adjusted a bit; this is left as work for later. It could make sense to use a second layer with yacc to build the binary tree representation of the hints from their strings, but this is left as future work.

A couple of regression tests need to have their output adjusted, as duplicates of recursive comments can now be fully detected at their correct position because the new lexer is much better at this job than the previous custom parser.
I have benchmarked this change with what would be a kind of worst-case scenario, comparing HEAD and this new query parser:
    $ pgbench -f select.sql -n -c 1 -T TIME_IN_SECS
    $ cat select.sql
    SELECT /*+SeqScan(a)*/ 1;

PostgreSQL had pg_stat_statements and pg_hint_plan loaded in shared_preload_libraries, with syslogger doing a minimal amount of activity:
    shared_preload_libraries = 'pg_stat_statements,pg_hint_plan'
    fsync = off
    log_statement = 'none'
    log_min_messages = fatal

CFLAGS used -O2, without --enable-cassert, and I did not see a difference between the patch and HEAD attributable to the lexer, in either TPS or perf profiles.

The memory footprint is stable across the same backend, with hints from comments still copied to TopMemoryContext (no change here). A difference is that the lexer does *not* run in TopMemoryContext; only the hint string is pstrdup()'d in it, *if any*.

No backpatch of this feature is planned for v16 and older stable branches, so this will be a v17~ only thing.

Per pull request #138.
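The two parsing fixes described in the commit message (a hint-like string inside a quoted value, and comments nested inside a hint) can be illustrated with a small hand-written scanner. This is only a sketch under stated assumptions, not the flex lexer added by the commit: the function name `toy_find_hint` is hypothetical, it assumes a hint is a block comment opening with `/*+`, it assumes standard-conforming strings where `''` is the only quote escape, and it ignores double-quoted identifiers, dollar quoting and `--` comments, which a scanner derived from psqlscan.l would have to handle.

```c
#include <stddef.h>

/*
 * Toy illustration only (hypothetical helper, not part of pg_hint_plan).
 * Copies the top-level text of the first comment opening with slash-star-plus
 * into "out" and returns 1, or returns 0 if no such comment is found.
 * Text inside single-quoted literals is never treated as a comment, and
 * comments nested inside the hint are dropped from the copied text.
 */
static int
toy_find_hint(const char *q, char *out, size_t outlen)
{
    size_t      n = 0;

    if (outlen == 0)
        return 0;

    while (*q)
    {
        if (*q == '\'')
        {
            /* Skip a quoted literal; two quotes in a row are an escape. */
            q++;
            while (*q)
            {
                if (q[0] == '\'' && q[1] == '\'')
                    q += 2;
                else if (*q++ == '\'')
                    break;
            }
        }
        else if (q[0] == '/' && q[1] == '*')
        {
            int         is_hint = (q[2] == '+');
            int         depth = 1;

            q += is_hint ? 3 : 2;
            while (*q && depth > 0)
            {
                if (q[0] == '/' && q[1] == '*')
                {
                    depth++;        /* entering a nested comment */
                    q += 2;
                }
                else if (q[0] == '*' && q[1] == '/')
                {
                    depth--;        /* leaving one comment level */
                    q += 2;
                }
                else
                {
                    /* Keep only the top-level text of a hint comment. */
                    if (is_hint && depth == 1 && n + 1 < outlen)
                        out[n++] = *q;
                    q++;
                }
            }
            if (is_hint)
            {
                out[n] = '\0';
                return 1;
            }
        }
        else
            q++;
    }

    out[0] = '\0';
    return 0;
}
```

Calling `toy_find_hint()` on the benchmark query `SELECT /*+SeqScan(a)*/ 1;` yields `SeqScan(a)`, while the quoted-value query from the first bullet yields no hint at all because the scanner never looks for comment markers inside a literal.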
Showing 11 changed files with 1,438 additions and 96 deletions.
@@ -0,0 +1,40 @@
    /*-------------------------------------------------------------------------
     *
     * query_scan.h
     *    lexical scanner for SQL commands
     *
     * This lexer can be used to extract hints from query contents, taking into
     * account what the backend would consider as values, for example.
     *
     * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
     * Portions Copyright (c) 1994, Regents of the University of California
     *
     * query_scan.h
     *
     *-------------------------------------------------------------------------
     */
    #ifndef QUERY_SCAN_H
    #define QUERY_SCAN_H

    #include "lib/stringinfo.h"

    /* Abstract type for lexer's internal state */
    typedef struct QueryScanStateData *QueryScanState;

    /* Termination states for query_scan() */
    typedef enum
    {
        QUERY_SCAN_INCOMPLETE,      /* end of line, SQL statement incomplete */
        QUERY_SCAN_EOL              /* end of line, SQL possibly complete */
    } QueryScanResult;

    extern QueryScanState query_scan_create(void);
    extern void query_scan_setup(QueryScanState state,
                                 const char *line, int line_len,
                                 int encoding, bool std_strings,
                                 int elevel);
    extern void query_scan_finish(QueryScanState state);
    extern QueryScanResult query_scan(QueryScanState state,
                                      StringInfo query_buf);

    #endif                          /* QUERY_SCAN_H */
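
The header exposes the lexer state only as a pointer to an incomplete struct, with separate create, setup and finish entry points. The sketch below shows that opaque-handle pattern in isolation. Every `toy_` name is hypothetical and none of this is the pg_hint_plan implementation. In particular, `toy_scan_destroy()` has no counterpart in the header above, and the assumption that a state can be set up again after finish is borrowed from how psqlscan behaves rather than confirmed by this diff.

```c
#include <stdlib.h>

/* Public view: callers only ever hold a pointer to an incomplete struct. */
typedef struct ToyScanStateData *ToyScanState;

/* Private view: normally defined only in the .c/.l file, not in the header. */
struct ToyScanStateData
{
    const char *buf;            /* input currently being scanned, not owned */
    size_t      len;            /* length of the input */
    size_t      pos;            /* read position inside the input */
};

/* Allocate an idle state with no input attached (NULL on allocation failure). */
static ToyScanState
toy_scan_create(void)
{
    return calloc(1, sizeof(struct ToyScanStateData));
}

/* Attach one input buffer to the state. */
static void
toy_scan_setup(ToyScanState state, const char *line, size_t line_len)
{
    state->buf = line;
    state->len = line_len;
    state->pos = 0;
}

/* Return the next input byte, or -1 once the input is exhausted. */
static int
toy_scan_next(ToyScanState state)
{
    if (state->buf == NULL || state->pos >= state->len)
        return -1;
    return (unsigned char) state->buf[state->pos++];
}

/* Detach the input; the state itself stays allocated and reusable. */
static void
toy_scan_finish(ToyScanState state)
{
    state->buf = NULL;
    state->len = 0;
    state->pos = 0;
}

/* Release the state (hypothetical; not declared in query_scan.h). */
static void
toy_scan_destroy(ToyScanState state)
{
    free(state);
}
```

Because callers never see the struct members, the private layout can change without touching any code that includes the header, which is presumably why the real lexer state is declared the same way.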