Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash of pg_hint_plan with Postgres 16~ with join paths #139

Closed
michaelpq opened this issue Jun 22, 2023 · 5 comments
Closed

Crash of pg_hint_plan with Postgres 16~ with join paths #139

michaelpq opened this issue Jun 22, 2023 · 5 comments
Labels

Comments

@michaelpq
Copy link
Collaborator

michaelpq commented Jun 22, 2023

I have been able to trigger a crash of pg_hint_plan after using the following query on HEAD, when using Postgres 16:

DO $$DECLARE r record;
BEGIN
    FOR r IN SELECT /*+IndexScan(tables) */ table_schema, table_name FROM information_schema.tables
             WHERE table_type = 'VIEW' AND table_schema = 'public'
    LOOP
        EXECUTE 'GRANT ALL ON ' || quote_ident(r.table_schema) || '.' || quote_ident(r.table_name) || ' TO webuser';
    END LOOP;
END$$;

Here is a backtrace of the parts relevant to pg_hint_plan:

#0  0x00007fa2255f3ef4 in find_join_hint (joinrelids=0x7fa2234fe5e8) at pg_hint_plan.c:4052
#1  0x00007fa2255f4e59 in make_join_rel_wrapper (root=0x557524eee680, rel1=0x7fa2234ff078, rel2=0x557524eff620) at pg_hint_plan.c:4477
#2  0x00007fa2255f6a45 in make_rels_by_clause_joins (root=0x557524eee680, old_rel=0x7fa2234ff078, other_rels_list=0x7fa2234fe7d8, other_rels=0x7fa2234fe7f0)
    at /home/ioltas/git/pg_hint_plan/core.c:478
#3  0x00007fa2255f6506 in pg_hint_plan_join_search_one_level (root=0x557524eee680, level=4) at /home/ioltas/git/pg_hint_plan/core.c:288
#4  0x00007fa2255f6101 in pg_hint_plan_standard_join_search (root=0x557524eee680, levels_needed=4, initial_rels=0x7fa2234fe7d8) at /home/ioltas/git/pg_hint_plan/core.c:150
#5  0x00007fa2255f549a in pg_hint_plan_join_search (root=0x557524eee680, levels_needed=4, initial_rels=0x7fa2234fe7d8) at pg_hint_plan.c:4666
#6  0x00005575240eab7d in make_rel_from_joinlist (root=0x557524eee680, joinlist=0x7fa2234f6438) at allpaths.c:3381
#7  0x00005575240e5baf in make_one_rel (root=0x557524eee680, joinlist=0x7fa2234f6438) at allpaths.c:229
#8  0x0000557524126347 in query_planner (root=0x557524eee680, qp_callback=0x55752412ca75 <standard_qp_callback>, qp_extra=0x7ffe6512bc70) at planmain.c:278

I was first suspecting something in 21dfca13, but the crash reproduces as well on a build that reverts it, so it seems to me that something is not in line with the module's code and the recent planner changes.

@saygoodbyye
Copy link

Hello! I've got server crash that provided the same backtrace as you presented, but the reproduction is different:

LOAD 'pg_hint_plan';
/*+Leading(ft_1 ft_2 t1)*/
SELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';

And the backtrace:

#0  0x00007fb2ff0f098d in find_join_hint (joinrelids=joinrelids@entry=0x7fb2ff00cb30) at pg_hint_plan.c:4108
#1  0x00007fb2ff0f96de in make_join_rel_wrapper (root=root@entry=0x7fb2ff0d9e20, rel1=rel1@entry=0x7fb2ff0e4b88, rel2=rel2@entry=0x7fb2ff0e4430) at pg_hint_plan.c:4533
#2  0x00007fb2ff0f9826 in make_rels_by_clause_joins (root=root@entry=0x7fb2ff0d9e20, old_rel=old_rel@entry=0x7fb2ff0e4b88, other_rels_list=other_rels_list@entry=0x7fb2ff00a9e8, other_rels=<optimized out>)
    at /home/egorchin/contribs/pg_hint_plan/core.c:478
#3  0x00007fb2ff0f99fd in pg_hint_plan_join_search_one_level (root=root@entry=0x7fb2ff0d9e20, level=level@entry=3) at /home/egorchin/contribs/pg_hint_plan/core.c:288
#4  0x00007fb2ff0f9dd9 in pg_hint_plan_standard_join_search (root=root@entry=0x7fb2ff0d9e20, levels_needed=levels_needed@entry=3, initial_rels=initial_rels@entry=0x7fb2ff00a9e8) at /home/egorchin/contribs/pg_hint_plan/core.c:150
#5  0x00007fb2ff0f9fbc in pg_hint_plan_join_search (root=0x7fb2ff0d9e20, levels_needed=3, initial_rels=0x7fb2ff00a9e8) at pg_hint_plan.c:4722
#6  0x000055fa4376310b in make_rel_from_joinlist (root=root@entry=0x7fb2ff0d9e20, joinlist=joinlist@entry=0x7fb2ff0e5b98) at allpaths.c:3381
#7  0x000055fa4376322b in make_one_rel (root=root@entry=0x7fb2ff0d9e20, joinlist=joinlist@entry=0x7fb2ff0e5b98) at allpaths.c:229
#8  0x000055fa4378b094 in query_planner (root=root@entry=0x7fb2ff0d9e20, qp_callback=qp_callback@entry=0x55fa4378c004 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffe93b3bc50) at planmain.c:281
#9  0x000055fa4379258d in grouping_planner (root=root@entry=0x7fb2ff0d9e20, tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:1515
#10 0x000055fa43793bb8 in subquery_planner (glob=<optimized out>, parse=parse@entry=0x7fb2ff0cee68, parent_root=parent_root@entry=0x55fa444c64c8, hasRecursion=hasRecursion@entry=false, tuple_fraction=0) at planner.c:1084
#11 0x000055fa4376113d in set_subquery_pathlist (root=root@entry=0x55fa444c64c8, rel=rel@entry=0x7fb2ff0cd818, rti=rti@entry=2, rte=rte@entry=0x55fa444fe8b0) at allpaths.c:2649
#12 0x000055fa437614cd in set_rel_size (root=root@entry=0x55fa444c64c8, rel=rel@entry=0x7fb2ff0cd818, rti=rti@entry=2, rte=rte@entry=0x55fa444fe8b0) at allpaths.c:424
#13 0x000055fa437615fc in set_base_rel_sizes (root=root@entry=0x55fa444c64c8) at allpaths.c:325
#14 0x000055fa4376318d in make_one_rel (root=root@entry=0x55fa444c64c8, joinlist=joinlist@entry=0x7fb2ff0ce428) at allpaths.c:186
#15 0x000055fa4378b094 in query_planner (root=root@entry=0x55fa444c64c8, qp_callback=qp_callback@entry=0x55fa4378c004 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffe93b3bf80) at planmain.c:281
#16 0x000055fa4379258d in grouping_planner (root=root@entry=0x55fa444c64c8, tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:1515
#17 0x000055fa43793bb8 in subquery_planner (glob=glob@entry=0x55fa444fe570, parse=parse@entry=0x55fa443f6048, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0)
    at planner.c:1084
#18 0x000055fa437941c7 in standard_planner (parse=parse@entry=0x55fa443f6048, 
    query_string=query_string@entry=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:413
#19 0x00007fb2ff0f597a in pg_hint_plan_planner (parse=0x55fa443f6048, 
    query_string=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';", cursorOptions=2048, 
    boundParams=0x0) at pg_hint_plan.c:3185
#20 0x000055fa437947ca in planner (parse=parse@entry=0x55fa443f6048, 
    query_string=query_string@entry=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:279
#21 0x000055fa43873225 in pg_plan_query (querytree=querytree@entry=0x55fa443f6048, 
    query_string=query_string@entry=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:904
#22 0x000055fa438732db in pg_plan_queries (querytrees=0x55fa444c3ae0, 
    query_string=query_string@entry=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:996
#23 0x000055fa43873744 in exec_simple_query (
    query_string=query_string@entry=0x55fa443f4870 "/*+Leading(ft_1 ft_2 t1)*/\nSELECT relname, seq_scan > 0 as seq_scan, idx_scan > 0 as idx_scan FROM pg_stat_user_tables WHERE schemaname = 'public' AND relname = 't1';")
    at postgres.c:1193
#24 0x000055fa4387567f in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637
#25 0x000055fa437d7acd in BackendRun (port=port@entry=0x55fa44423810) at postmaster.c:4433
#26 0x000055fa437d9afa in BackendStartup (port=port@entry=0x55fa44423810) at postmaster.c:4161
#27 0x000055fa437d9c98 in ServerLoop () at postmaster.c:1778
#28 0x000055fa437db1bb in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55fa443ee960) at postmaster.c:1462
#29 0x000055fa436ff7a3 in main (argc=3, argv=0x55fa443ee960) at main.c:198

I've used this branches:
postgres - master
pg_hint_plan - master

Server configuration:

./configure CFLAGS=" -Og" --enable-tap-tests --enable-cassert --enable-debug

@mikecaat
Copy link
Contributor

mikecaat commented Aug 18, 2023

Hi,

Although I'm not yet able to understand the reason well, I identify postgres/postgres@2489d76 making the error.

You can reproduce following combinations.

# the backtrace
pg_hint_plan.so!find_join_hint(Relids joinrelids) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\pg_hint_plan.c:4002)
pg_hint_plan.so!make_join_rel_wrapper(PlannerInfo * root, RelOptInfo * rel1, RelOptInfo * rel2) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\pg_hint_plan.c:4427)
pg_hint_plan.so!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo * old_rel, List * other_rels_list, ListCell * other_rels) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\core.c:478)
pg_hint_plan.so!pg_hint_plan_join_search_one_level(PlannerInfo * root, int level) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\core.c:288)
pg_hint_plan.so!pg_hint_plan_standard_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\core.c:150)
pg_hint_plan.so!pg_hint_plan_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\pg_hint_plan.c:4616)
make_rel_from_joinlist(PlannerInfo * root, List * joinlist) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:3315)
make_one_rel(PlannerInfo * root, List * joinlist) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:211)
query_planner(PlannerInfo * root, query_pathkeys_callback qp_callback, void * qp_extra) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planmain.c:278)
grouping_planner(PlannerInfo * root, double tuple_fraction) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planner.c:1487)
subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, _Bool hasRecursion, double tuple_fraction) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planner.c:1056)
set_subquery_pathlist(PlannerInfo * root, RelOptInfo * rel, Index rti, RangeTblEntry * rte) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:2582)
set_rel_size(PlannerInfo * root, RelOptInfo * rel, Index rti, RangeTblEntry * rte) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:406)
set_base_rel_sizes(PlannerInfo * root) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:307)
make_one_rel(PlannerInfo * root, List * joinlist) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\path\allpaths.c:168)
query_planner(PlannerInfo * root, query_pathkeys_callback qp_callback, void * qp_extra) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planmain.c:278)
grouping_planner(PlannerInfo * root, double tuple_fraction) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planner.c:1487)
subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, _Bool hasRecursion, double tuple_fraction) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planner.c:1056)
standard_planner(Query * parse, const char * query_string, int cursorOptions, ParamListInfo boundParams) (\home\ikedamsh\repos\psql\master\postgresql-master\src\backend\optimizer\plan\planner.c:411)
pg_hint_plan.so!pg_hint_plan_planner(Query * parse, const char * query_string, int cursorOptions, ParamListInfo boundParams) (\home\ikedamsh\repos\psql\extension\pg_hint_plan\pg_hint_plan.c:3130)

mikecaat pushed a commit to mikecaat/pg_hint_plan that referenced this issue Aug 22, 2023
find_join_hint() calculates the join level using joinrelids which
is a bitmapset of relations within the join with the assumption that
the relids has only base relations.

But, relids became to have outer-join relids with PostgreSQL
commit (2489d76c). So, the calculation logic became wrong and
crash on some condition.

The commit fixes the logic to calculate joinrelids to consider
only base relations.

This patch is for PoC. I think we need to think the following.

* Is the test enough and stable?
* Don't we need to change OuterInnerJoinCreate() and
  transform_join_hints which call find_join_hint()?
@mikecaat
Copy link
Contributor

mikecaat commented Aug 22, 2023

IIUC, the reason of crash is the relids can have outer-join relids currently. I make an above PoC patch to solve the issue. Since I'm not familiar with the planner code and I'm studying now, please let me know if you have any comment.

BTW, since the pg_hint_plan tests fails with current PostgreSQL head, please try with postgres/postgres@a8b7424

@michaelpq
Copy link
Collaborator Author

Since I'm not familiar with the planner code and I'm studying now, please let me know if you have any comment.

With PG16 coming soon at the door, we are going to do something here very soon. I'll try to look at your patch. (I'm a bit overdue with the SQL script updates, as well..)

@michaelpq
Copy link
Collaborator Author

BTW, since the pg_hint_plan tests fails with current PostgreSQL head, please try with postgres/postgres@a8b7424

The tests now pass on current Postgres HEAD. I've fixed the issues with EXPLAIN and the costs.

michaelpq added a commit that referenced this issue Aug 30, 2023
find_join_hint() calculates the join level using joinrelids which
is a bitmapset of relations.  However, since the upstream commit
2489d76c introduced in 16, it is possible that these include outer-join
relids, something that pg_hint_plan cannot digest because it expects
only case relations.

This commit filters out any outer-join relids from the lists of relids
considered during the join calculations.  This could cause crashes
because of corrupted relids lists.

There may be a smarter way to fix that, but this is proving to be good
enough for now.

Per issue #139.

Reported-by: Michael Paquier
Author: Masahiro Ikeda
Backpatch-through: 16
michaelpq added a commit that referenced this issue Aug 30, 2023
find_join_hint() calculates the join level using joinrelids which
is a bitmapset of relations.  However, since the upstream commit
2489d76c introduced in 16, it is possible that these include outer-join
relids, something that pg_hint_plan cannot digest because it expects
only case relations.

This commit filters out any outer-join relids from the lists of relids
considered during the join calculations.  This could cause crashes
because of corrupted relids lists.

There may be a smarter way to fix that, but this is proving to be good
enough for now.

Per issue #139.

Reported-by: Michael Paquier
Author: Masahiro Ikeda
Backpatch-through: 16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants