Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server crashes on executing explain after invalid hint #141

Closed
saygoodbyye opened this issue Jun 26, 2023 · 4 comments
Closed

Server crashes on executing explain after invalid hint #141

saygoodbyye opened this issue Jun 26, 2023 · 4 comments

Comments

@saygoodbyye
Copy link

saygoodbyye commented Jun 26, 2023

I have got a server crash when executing the following script:

CREATE EXTENSION pg_hint_plan;
CREATE TABLE t1 (id int PRIMARY KEY, val int);
CREATE TABLE t2 (id int PRIMARY KEY, val int);
LOAD 'pg_hint_plan'; 
SET pg_hint_plan.parse_messages TO error;
/*+Set(work_mem "1M")*/
EXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;
/*+Set(work_mem "1MB")*/
EXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;

Server configuration:

./configure CFLAGS=" -Og" --enable-tap-tests --enable-cassert --enable-debug

I have used this branches:
postgres - master
pg_hint_plan - master

Also I've reproduced this crash on branches:
postgres - REL_15_STABLE
pg_hint_plan - PG15

Here is the backtrace:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f25f963c406 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f25f962287c in __GI_abort () at ./stdlib/abort.c:79
#5  0x000055709bc9ea7b in ExceptionalCondition (conditionName=conditionName@entry=0x55709be009fd "IsPointerList(list)", fileName=fileName@entry=0x55709be2be0e "list.c", lineNumber=lineNumber@entry=496) at assert.c:66
#6  0x000055709ba09c9d in lcons (datum=datum@entry=0x55709cfbe798, list=0x55709d06f0f8) at list.c:496
#7  0x00007f25f99744be in push_hint (hstate=0x55709cfbe798) at pg_hint_plan.c:2831
#8  0x00007f25f99746de in pg_hint_plan_planner (parse=0x55709cf8c480, query_string=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", cursorOptions=2048, boundParams=0x0)
    at pg_hint_plan.c:3125
#9  0x000055709ba88b3e in planner (parse=parse@entry=0x55709cf8c480, query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:279
#10 0x000055709bb67207 in pg_plan_query (querytree=querytree@entry=0x55709cf8c480, query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:904
#11 0x000055709b91b6c2 in ExplainOneQuery (query=0x55709cf8c480, cursorOptions=cursorOptions@entry=2048, into=into@entry=0x0, es=es@entry=0x55709cfbe4f8, 
    queryString=queryString@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", params=params@entry=0x0, queryEnv=0x0) at explain.c:406
#12 0x000055709b91c103 in ExplainQuery (pstate=pstate@entry=0x55709cfb58c0, stmt=stmt@entry=0x55709cf8c2c0, params=params@entry=0x0, dest=dest@entry=0x55709cfb5830) at explain.c:290
#13 0x000055709bb6d4fa in standard_ProcessUtility (pstmt=0x55709cf8c370, queryString=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;", readOnlyTree=<optimized out>, 
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55709cfb5830, qc=0x7ffc7d66d680) at utility.c:870
#14 0x000055709bb6d9c5 in ProcessUtility (pstmt=pstmt@entry=0x55709cf8c370, queryString=<optimized out>, readOnlyTree=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, queryEnv=<optimized out>, 
    dest=0x55709cfb5830, qc=0x7ffc7d66d680) at utility.c:530
#15 0x000055709bb6ae8e in PortalRunUtility (portal=portal@entry=0x55709d010b68, pstmt=0x55709cf8c370, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x55709cfb5830, 
    qc=qc@entry=0x7ffc7d66d680) at pquery.c:1158
#16 0x000055709bb6b2da in FillPortalStore (portal=portal@entry=0x55709d010b68, isTopLevel=isTopLevel@entry=true) at pquery.c:1031
#17 0x000055709bb6b65d in PortalRun (portal=portal@entry=0x55709d010b68, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x55709d05f0f8, 
    altdest=altdest@entry=0x55709d05f0f8, qc=0x7ffc7d66d870) at pquery.c:763
#18 0x000055709bb6780f in exec_simple_query (query_string=query_string@entry=0x55709cf8b198 "/*+Set(work_mem \"1MB\")*/\nEXPLAIN (COSTS false) SELECT * FROM t1, t2 WHERE t1.id = t2.id;") at postgres.c:1274
#19 0x000055709bb69653 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4632
#20 0x000055709bacbc6b in BackendRun (port=port@entry=0x55709cfb8ca0) at postmaster.c:4461
#21 0x000055709bacdc98 in BackendStartup (port=port@entry=0x55709cfb8ca0) at postmaster.c:4189
#22 0x000055709bacde36 in ServerLoop () at postmaster.c:1779
#23 0x000055709bacf359 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55709cf854f0) at postmaster.c:1463
#24 0x000055709b9f49a3 in main (argc=3, argv=0x55709cf854f0) at main.c:198

Best regards,
Egor Chindyaskin
Postgres Professional: http://postgrespro.com/

@Anisimov-ds
Copy link
Contributor

If pg_hint_plan.parse_messages level is set to ERROR, the setup_guc_enforcement() function breaks immediately when the SET hint has an error. In this case, the pop_hint() function is not called.

@michaelpq
Copy link
Collaborator

Thanks for the report and the patch! Will look into it!

@jjune235
Copy link

I had the same problem with PG11

michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
michaelpq added a commit that referenced this issue Jun 29, 2023
The planner hook uses a PG_TRY/PG_CATCH block to handle the hint stack
available for a given query.  push_hint() is in charge of updating the
stack with a new set in the planner hook, and pop_hint() would reset the
stack to its previous state.

It happens that the code was not cleaning up the stack between
push_hint() and the PG_TRY/PG_CATCH block, and that it was possible to
trigger ERRORs, corrupting the hint stack hold in a backend.  Based on
report #141, setting pg_hint_plan.parse_messages TO ERROR with an
incorrect Set hint would be one case where the stack would get
corrupted, because the GUCs defined in such hints are loaded after the
hint stack update and before PG_TRY().  Any errors happening before
PG_TRY() would be enough to corrupt the stack, like a palloc() failing
on OOM, though this is very unlikely.  This commit takes care of this
bug by removing completely the window where the stack could get
corrupted, by moving PG_TRY a little bit earlier.

A regression test is added, provided by reporter.  The patch is from
me.

Per report #141, from Egor Chindyaskin.

Backpatch-through: 11
@michaelpq
Copy link
Collaborator

The issue is now fixed. I have reworked at bit the code around push/pop so as we handle the stack consistently now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants