Server crashes on executing explain after invalid hint #141

saygoodbyye · 2023-06-26T04:42:27Z

I have got a server crash when executing the following script:

CREATE EXTENSION pg_hint_plan;
CREATE TABLE t1 (id int PRIMARY KEY, val int);
CREATE TABLE t2 (id int PRIMARY KEY, val int);
LOAD 'pg_hint_plan'; 
SET pg_hint_plan.parse_messages TO error;
/*+Set(work_mem "1M")*/
EXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;
/*+Set(work_mem "1MB")*/
EXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;

Server configuration:

./configure CFLAGS=" -Og" --enable-tap-tests --enable-cassert --enable-debug

I have used this branches:
postgres - master
pg_hint_plan - master

Also I've reproduced this crash on branches:
postgres - REL_15_STABLE
pg_hint_plan - PG15

Here is the backtrace:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f25f963c406 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f25f962287c in __GI_abort () at ./stdlib/abort.c:79
#5  0x000055709bc9ea7b in ExceptionalCondition (conditionName=conditionName@entry=0x55709be009fd "IsPointerList(list)", fileName=fileName@entry=0x55709be2be0e "list.c", lineNumber=lineNumber@entry=496) at assert.c:66
#6  0x000055709ba09c9d in lcons (datum=datum@entry=0x55709cfbe798, list=0x55709d06f0f8) at list.c:496
#7  0x00007f25f99744be in push_hint (hstate=0x55709cfbe798) at pg_hint_plan.c:2831
#8  0x00007f25f99746de in pg_hint_plan_planner (parse=0x55709cf8c480, query_string=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", cursorOptions=2048, boundParams=0x0)
    at pg_hint_plan.c:3125
#9  0x000055709ba88b3e in planner (parse=parse@entry=0x55709cf8c480, query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:279
#10 0x000055709bb67207 in pg_plan_query (querytree=querytree@entry=0x55709cf8c480, query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:904
#11 0x000055709b91b6c2 in ExplainOneQuery (query=0x55709cf8c480, cursorOptions=cursorOptions@entry=2048, into=into@entry=0x0, es=es@entry=0x55709cfbe4f8, 
    queryString=queryString@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", params=params@entry=0x0, queryEnv=0x0) at explain.c:406
#12 0x000055709b91c103 in ExplainQuery (pstate=pstate@entry=0x55709cfb58c0, stmt=stmt@entry=0x55709cf8c2c0, params=params@entry=0x0, dest=dest@entry=0x55709cfb5830) at explain.c:290
#13 0x000055709bb6d4fa in standard_ProcessUtility (pstmt=0x55709cf8c370, queryString=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", readOnlyTree=<optimized out>, 
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55709cfb5830, qc=0x7ffc7d66d680) at utility.c:870
#14 0x000055709bb6d9c5 in ProcessUtility (pstmt=pstmt@entry=0x55709cf8c370, queryString=<optimized out>, readOnlyTree=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, queryEnv=<optimized out>, 
    dest=0x55709cfb5830, qc=0x7ffc7d66d680) at utility.c:530
#15 0x000055709bb6ae8e in PortalRunUtility (portal=portal@entry=0x55709d010b68, pstmt=0x55709cf8c370, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x55709cfb5830, 
    qc=qc@entry=0x7ffc7d66d680) at pquery.c:1158
#16 0x000055709bb6b2da in FillPortalStore (portal=portal@entry=0x55709d010b68, isTopLevel=isTopLevel@entry=true) at pquery.c:1031
#17 0x000055709bb6b65d in PortalRun (portal=portal@entry=0x55709d010b68, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x55709d05f0f8, 
    altdest=altdest@entry=0x55709d05f0f8, qc=0x7ffc7d66d870) at pquery.c:763
#18 0x000055709bb6780f in exec_simple_query (query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;") at postgres.c:1274
#19 0x000055709bb69653 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4632
#20 0x000055709bacbc6b in BackendRun (port=port@entry=0x55709cfb8ca0) at postmaster.c:4461
#21 0x000055709bacdc98 in BackendStartup (port=port@entry=0x55709cfb8ca0) at postmaster.c:4189
#22 0x000055709bacde36 in ServerLoop () at postmaster.c:1779
#23 0x000055709bacf359 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55709cf854f0) at postmaster.c:1463
#24 0x000055709b9f49a3 in main (argc=3, argv=0x55709cf854f0) at main.c:198

Best regards,
Egor Chindyaskin
Postgres Professional: http://postgrespro.com/

The text was updated successfully, but these errors were encountered:

Anisimov-ds · 2023-06-26T08:31:45Z

If pg_hint_plan.parse_messages level is set to ERROR, the setup_guc_enforcement() function breaks immediately when the SET hint has an error. In this case, the pop_hint() function is not called.

michaelpq · 2023-06-27T05:27:19Z

Thanks for the report and the patch! Will look into it!

jjune235 · 2023-06-28T01:42:30Z

I had the same problem with PG11

The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack available for a given query. push_hint() is in charge of updating the stack with a new set in the planner hook, and pop_hint() would reset the stack to its previous state. It happens that the code was not cleaning up the stack between push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to trigger ERRORs, corrupting the hint stack hold in a backend. Based on report #141, setting pg_hint_plan.parse_messages TO ERROR with an incorrect Set hint would be one case where the stack would get corrupted, because the GUCs defined in such hints are loaded after the hint stack update and before PG_TRY(). Any errors happening before PG_TRY() would be enough to corrupt the stack, like a palloc() failing on OOM, though this is very unlikely. This commit takes care of this bug by removing completely the window where the stack could get corrupted, by moving PG_TRY a little bit earlier. A regression test is added, provided by reporter. The patch is from me. Per report #141, from Egor Chindyaskin. Backpatch-through: 11

michaelpq · 2023-06-29T10:08:13Z

The issue is now fixed. I have reworked at bit the code around push/pop so as we handle the stack consistently now.

Anisimov-ds mentioned this issue Jun 26, 2023

Fix use-after-free when pg_hint_plan.parse_messages is ERROR #142

Closed

michaelpq closed this as completed Jun 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Server crashes on executing explain after invalid hint #141

Server crashes on executing explain after invalid hint #141

saygoodbyye commented Jun 26, 2023 •

edited

Loading

Anisimov-ds commented Jun 26, 2023

michaelpq commented Jun 27, 2023

jjune235 commented Jun 28, 2023

michaelpq commented Jun 29, 2023

Server crashes on executing explain after invalid hint #141

Server crashes on executing explain after invalid hint #141

Comments

saygoodbyye commented Jun 26, 2023 • edited Loading

Anisimov-ds commented Jun 26, 2023

michaelpq commented Jun 27, 2023

jjune235 commented Jun 28, 2023

michaelpq commented Jun 29, 2023

saygoodbyye commented Jun 26, 2023 •

edited

Loading