Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash caused by stale owner #78997

Merged
merged 1 commit into from
Dec 8, 2023

Conversation

Sauermann
Copy link
Contributor

@Sauermann Sauermann commented Jul 3, 2023

Adjust NOTIFICATION_PREDELETE in Node to clean up owned nodes.
Also print a warning, when the owner becomes invalid.
resolve #78736

Updated 2023-11-17: Rebased

@Sauermann Sauermann requested a review from a team as a code owner July 3, 2023 19:25
@YuriSizov YuriSizov added this to the 4.2 milestone Jul 3, 2023
@RandomShaper
Copy link
Member

I see cases where this is needed, but I think there's still a case to fix: in the MRP in #78736, shouldn't the child_node.reparent(self) break the ownership of child_node by owner_node?

@Sauermann
Copy link
Contributor Author

Sauermann commented Jul 4, 2023

I have investigated this case and I was unable to create a crash by reparenting after the changes of this PR.

Still I believe, that there is a bug, that currently reparenting allows a Node to have an owner, that is not an ancestor (which contradicts the owner-description).
I was unable to finish investigating this bug yesterday and intend to look into this again.

@Sauermann
Copy link
Contributor Author

I have updated the PR with an additional verification, that the owner is always an ancestor and added a warning about this case.

I have also considered as a more strict alternative in Node::add_child, but would consider it as too restrictive:

	ERR_FAIL_COND_MSG(p_child->data.owner && p_child->data.owner != this && !p_child->data.owner->is_ancestor_of(this), vformat("Can't add child '%s' to '%s', owner '%s' would no longer be an ancestor.", p_child->get_name(), get_name(), p_child->data.owner->get_name()));

I have also considered as a less strict alternative to adjust the documentation of Node.owner, but had the impression that this would find not enough attention and users might not easily find the reason, why their node's owner became null:

[b]Note:[/b] The owner of the [Node] is automatically erased, if the node is moved to a place in the scene tree where its owner is no longer an ancestor.

@RandomShaper
Copy link
Member

I agree with that approach.

I'm wondering if it should be expected that a node still has owned nodes by the time the destructor is run. Shouldn't all the previous mechanics of the owner and/or owned nodes (pre-delete, exit tree notifications, etc., etc.) have cleaned things up?

@Sauermann
Copy link
Contributor Author

One of the MRPs in the linked issue sets the owner of a Node, while that node is not in the tree. So tree-notifications don't help in these cases.
Also when handling this during tree-notifications, the owner would always be removed when a Node is relocated in the scene tree, which would be bad for usability.

I was not aware of pre-delete, thanks for bringing this to my attention.
During NOTIFICATION_PREDELETE currently only data.owner is cleared, but not data.owned.
I believe, that it would be conceptually better, if I move the data.owned-clean-up code from ~Node to NOTIFICATION_PREDELETE.

@Sauermann Sauermann force-pushed the fix-owner-crash branch 3 times, most recently from 99492a6 to 6b6cd99 Compare July 6, 2023 05:29
@YuriSizov YuriSizov modified the milestones: 4.2, 4.3 Oct 27, 2023
@YuriSizov
Copy link
Contributor

Slightly relevant here: we now have a PR that changes how reparenting works #81506.
But I guess that wouldn't completely eliminate invalid cases we are trying to prevent here? I would err on the side of caution at this point and bump this to 4.3.

But an approval from @RandomShaper would still be appreciated so we can merge it as soon as the next dev cycle starts 🙂

Comment on lines 1371 to 1394
if (p_child->data.owner && !p_child->data.owner->is_ancestor_of(p_child)) {
// Owner of p_child must be ancestor of p_child.
WARN_PRINT(vformat("Adding '%s' as child to '%s' would make owner '%s' inconsistent. Erasing owner. Consider removing the owner beforehand.", p_child->get_name(), get_name(), p_child->data.owner->get_name()));
p_child->_clean_up_owner();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this scenario should ever be possible. If it's possible, guards are missing elsewhere. But if you want to be extra sure, maybe this should only happen in editor builds?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move this check to add_child() instead, which is anyway the only caller of _add_child_nocheck() (possibly, to a counterpart of _validate_child_name() named _validate_child_future_ownership(node, intended_new_parent)). That way, _add_child_nocheck() still honors its name (althought the no check part was about names, but now we're validating owners, it's good to extend that notion).

Aside, I think that with this we would be catching every possibility of a wrong owner. That said, given that this may have a performance impact, I see value in restricting the check to debug builds, and making it just a check. I mean, keep the owner (even if invalid), not to change behavior across debug and non-debug. After all, I don't think owner chain integrity is an internal requisite of the engine (engine won't crash or misbehave due to an inconsitency there; only the user may get unexpected results).

Finally, I'd suggest changing the wording so erasing/removing owner is expressed as unsetting the owner instead, for enhanced clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the validation of the owner to add_child() within a DEBUG_ENABLED section.

Please note that _add_child_nocheck() is additionally called from SceneState::instantiate. But since this function deals with instantiating nodes, my guess would be, that it's unnecessary to add checks there.

I would prefer to completely prohibit inconsistent states (so that the user no longer gets unexpected results), but based on the performance-argument, I assume that a warning can be considered sufficient.

For good measure, I have also checked, that the MRP in the linked issue still crashes in v4.2.rc.custom_build [80de898] and gets fixed by this PR.

Adjust `NOTIFICATION_PREDELETE` in `Node` to clean up owned nodes.
Also print a warning, when the owner becomes invalid.
@akien-mga akien-mga added the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Nov 24, 2023
@YuriSizov YuriSizov merged commit b6c1573 into godotengine:master Dec 8, 2023
15 checks passed
@YuriSizov
Copy link
Contributor

Thanks!

@Sauermann Sauermann deleted the fix-owner-crash branch December 29, 2023 18:08
@YuriSizov YuriSizov removed the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Jan 25, 2024
@YuriSizov
Copy link
Contributor

Cherry-picked for 4.2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash when freeing Node with invalid owner
4 participants