-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supplying IndexScan hint with unavailable index gives unexpected results. #136
Comments
Thanks for the report. Hm, I see. Indeed, once the index is renamed, it feels more natural to fall back to the index renamed if the IndexScan hint does not include the index wanted, rather than blindly switch to an IndexScan. The fallback plan would depend on the costs, but with enough tuples I would not have expected a sequential scan to be chosen as fallback. |
Correct, and in fact if all the supplied indexes have the wrong name, the only available path becomes the sequential scan. Thinking about this a bit more, it seems that the correct behavior here is to make the entire hint invalid if it contains an index that does not exist and debug print should make it clear that the hint is not being used. I prepared a patch to do something along that line. In the example below, I created a table with foo_idx1 and foo_idx2
if the IndexScan hint is supplied indexes that do not exist, and even if one of the indexes does in fact exist, it should invalidate the entire hint.
with a valid index, the hint is used.
Let me know your thoughts and I can complete the patch with proper test cases. |
Agreed. That feels more natural to me to just invalidate the whole hint if it includes at least one index that does not exist. It seems to me that we should warn about that at parsing, at least? The module is rather soft when it comes to error handling. |
I don't think it will be possible to do this check at hint parse time since we don't have the list of available indexes at that point. I will double check and prepare a patch for review. |
While this is the ideal solution, upon further investigation it's really not possible without introducing a great deal of complexity to deal with partitioned tables. For non-partitioned tables, it's quite simple to validate that all hinted indexes are in the table. However, for partitioned tables the restrict_indexes function is called for each child table ( via the set_rel_pathlist_hook ) and and indexes in the hint that may not be available to one of the children may apply to another child. For example:
So, it's still a good idea to make the IndexScan safer by simply making sure that we are not destroying the entire index list if none of the indexes in the hint are available.
Attached below is a patch with tests. 0001-Fix-handling-of-unavailable-indexes-in-IndexScan-hin.patch |
I have looked at the tests and.. Hmm. This diff and things around that are giving me a very long pause. The hint specified in this case allows one to specify an index that will be used across one partition while the others would use a SeqScan, but the patch would ignore the index name specified and force an IndexScan across all the partitions. Okay, one can use /*+IndexScan(p2_c1 p2_c1_id_val_idx)*/ to mimic the previous behavior, still the other test changes also clearly document what happens when an index does not exist. Anyway, with this patch,, I have found what looks like a case where the hint gets entirely broken. A regular expression with a BitmapScan still forces a BitmapScan even if no indexes match with the expression given. This is tracked by the following case: That does not look OK to me, because I would not expect BitmapScan in this case. I am not sure how to attach a file to this reply, but I have pushed a patch on a local branch to keep track of the plan changes that are affect by this change in the regression tests: |
These 2 cases are actually related to the same incorrect behavior the patch introduces. The issue is that the scan method is enforced even without matching indexes. You can think of it as So, in v2 of the patch, I think this is handled better. if we are going to end up deleting all the indexes, don't delete anything, and mark the scan hint as UNUSED. Also, don't enforce the scan method. This is essentially as if we did not supply the hint to begin with. Therefore, instead of forcing a sequential scan, we let the planning occur as if there was no scan hint supplied. the v2 test changes are also a lot more palatable. Regards, |
Based on the patch, this query:
Changes to the following:
Okay, I am not surprised by this behavior in the patch. The user tries an IndexScan with a regexp, there is no index matching, and we just fallback to the default of an index-only scan.
This one is also less confusing than HEAD. The regexp does not match any of the indexes on the relations, falls back to a SeqScan. And this switches the hint from "used" to "not used", which is also correct because we don't use it. As a whole, the regression tests are indeed much easier to understand this way. Regarding the patch, I had a few comments:
This clearly deserves a comment. In short, it seems to me that we are in the case where we don't have any indexes left, and that we'd better explain why this is set? Anyway, do we need this flag at all? Why isn't HintStatus sufficient to track if a hint should be used or not.
s/mehthod/method/
Hmm. Not sure that this is completely correct. |
good point. I cleaned this up.
I added a clarifying comment
HintStatus->base.state will track if a hint is used overall. If any of the relations ( parent or child ) in the query use the hint, we mark the hint as USED. the set_scan_method is needed to track if we should enforce the scan method hint for each relation. I also renamed set_scan_method to skip_scan_method and set the default as false.
fixed
This is not needed in our patch as the HINT_STATE_USED is set by setup_scan_method_enforcement. By keeping this line of code, we are saying that the hint is used before we call restrict_indexes to see if that is really true. See v3: Regards, |
As showed by the regression tests, this fixes a lot of correctness issues behind the hints considered as "used" by the module but not actually used because the index restrictions were applied before enforcing a scan method. For example, an index IndexScan with a list of indexes not used caused two problems: - The hint would be marked as used, but it resulted in being not used, falling down to a sequential scan. - Discarding all the indexes in the hint could cause invisible regressions, as the hint would fail to consider any existing hints. There was also an extra case with IndexScanRegexp where having a list of indexes available turns out to cause a sequential scan and still mark the hint as used. This commit now prevents restrict_indexes from removing any indexes from the relation index list if none of the hinted indexes are available, and switches the scan enforcement method to be after the index restrictions are applied. This safeguard prevents an IndexScan hint supplied without any available indexes to result in a sequential scan, for example. Per discussion on #136. Author: Sami Imseih Backpatch-through: 13
As showed by the regression tests, this fixes a lot of correctness issues behind the hints considered as "used" by the module but not actually used because the index restrictions were applied before enforcing a scan method. For example, an index IndexScan with a list of indexes not used caused two problems: - The hint would be marked as used, but it resulted in being not used, falling down to a sequential scan. - Discarding all the indexes in the hint could cause invisible regressions, as the hint would fail to consider any existing hints. There was also an extra case with IndexScanRegexp where having a list of indexes available turns out to cause a sequential scan and still mark the hint as used. This commit now prevents restrict_indexes from removing any indexes from the relation index list if none of the hinted indexes are available, and switches the scan enforcement method to be after the index restrictions are applied. This safeguard prevents an IndexScan hint supplied without any available indexes to result in a sequential scan, for example. Per discussion on #136. Author: Sami Imseih Backpatch-through: 13
As showed by the regression tests, this fixes a lot of correctness issues behind the hints considered as "used" by the module but not actually used because the index restrictions were applied before enforcing a scan method. For example, an index IndexScan with a list of indexes not used caused two problems: - The hint would be marked as used, but it resulted in being not used, falling down to a sequential scan. - Discarding all the indexes in the hint could cause invisible regressions, as the hint would fail to consider any existing hints. There was also an extra case with IndexScanRegexp where having a list of indexes available turns out to cause a sequential scan and still mark the hint as used. This commit now prevents restrict_indexes from removing any indexes from the relation index list if none of the hinted indexes are available, and switches the scan enforcement method to be after the index restrictions are applied. This safeguard prevents an IndexScan hint supplied without any available indexes to result in a sequential scan, for example. Per discussion on #136. Author: Sami Imseih Backpatch-through: 13
As showed by the regression tests, this fixes a lot of correctness issues behind the hints considered as "used" by the module but not actually used because the index restrictions were applied before enforcing a scan method. For example, an index IndexScan with a list of indexes not used caused two problems: - The hint would be marked as used, but it resulted in being not used, falling down to a sequential scan. - Discarding all the indexes in the hint could cause invisible regressions, as the hint would fail to consider any existing hints. There was also an extra case with IndexScanRegexp where having a list of indexes available turns out to cause a sequential scan and still mark the hint as used. This commit now prevents restrict_indexes from removing any indexes from the relation index list if none of the hinted indexes are available, and switches the scan enforcement method to be after the index restrictions are applied. This safeguard prevents an IndexScan hint supplied without any available indexes to result in a sequential scan, for example. Per discussion on #136. Author: Sami Imseih Backpatch-through: 13
Sorry for the delay in replying back. I have finished looking at that, and I think that v3 is mostly good. One thing I did not like is that we'd want to track if the scan method enforcement should happen or not depending on the static state in the current hint while the check is local to a specific code path. So I have changed things so as restrict_indexes() returns a status result instead, that's afterwards used to see if the second step should happen or not. Another thing was that the patch was getting iffy in PG11 and PG12 due to the changes in list handlings, so I have skipped these branches, applying the patch down to PG13. With PG11 being EOL at the next release point, changing this behavior was a bit stressing. PG12 could be changed, perhaps. |
Thanks! I'm OK with backpatching down to 13 for now. |
If an index is dropped or renamed and the hint is not updated, this can cause some unexpected behavior and performance regression until the hint is corrected.
The reason for this is becuase restrict_indexes compares for each rel index if it's in a hint, and if it's not remove it immediately from consideration. So, in cases where all the indexes in the hint are no longer viable because they've been renamed, then we end up unexpected results.
I think it's better to have a safeguard in which if we are about to delete all the indexes, then don't delete any indexes at all.
Here is an idea for a patch below. Instead of deleting the idnexes immediately, store the index OIDs of the candidate indexes to delete. If at the end we observe we are about to delete all the indexes, don't do anything.
Regards,
Sami Imseih
Amazon Web Services (AWS)
The text was updated successfully, but these errors were encountered: