LibOS/ipc, async helper: keep refcount to helper thread on exit #321
Can one of the admins verify this patch?
Just to clarify the design: The idea here is that the IPC helper should never be the last one to exit. Rather, the IPC helper thread, when terminated, passes its thread object back to the last "real" thread, which frees the resources?
Reviewable status: 0 of 5 files reviewed, all discussions resolved
Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion (waiting on @yamahata)
a discussion (no related file):
+1 to @donporter's question.
Also, could you please write in the commit messages what a particular commit does and why, not just what the bug is? This way it's easier for reviewers to understand, and it's more consistent with our repository.
Yes, exactly.
Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion (waiting on @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
+1 to @donporter's question.
Also, could you please write in the commit messages what a particular commit does and why, not just what the bug is? This way it's easier for reviewers to understand, and it's more consistent with our repository.
Sure. How about adding the following?
So the solution is to have a (usually master) thread keep a reference to each of those two helper threads, so that the (master) thread will be the last thread to release the reference count. This way self-destruction can be avoided.
Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion (waiting on @mkow)
a discussion (no related file):
Previously, yamahata wrote…
Sure. How about adding the following?
So the solution is to have a (usually master) thread keep a reference to each of those two helper threads, so that the (master) thread will be the last thread to release the reference count. This way self-destruction can be avoided.
This works for me. Feel free to add some of the verbiage from my question if you like. I think some comments are definitely in order here. Thank you for the fix!
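For readers following the design discussion, here is a minimal sketch of the hand-off agreed on above. It is not the code from this patch: `demo_thread`, `demo_get`, `demo_put`, `demo_start_helper` and `demo_reap_helper` are hypothetical names, and plain pthreads stand in for the LibOS thread machinery. The creating thread holds its own reference, so the helper's put on exit can never be the last one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread object; not the real struct shim_thread. */
struct demo_thread {
    atomic_int ref_count;
    pthread_t  host_thread;
};

/* Set only if the helper ever drops the last reference to itself. */
static atomic_bool g_helper_freed_itself = false;

static void demo_get(struct demo_thread* t) {
    atomic_fetch_add(&t->ref_count, 1);
}

/* Returns true if this call dropped the last reference and freed the object. */
static bool demo_put(struct demo_thread* t) {
    if (atomic_fetch_sub(&t->ref_count, 1) == 1) {
        free(t);
        return true;
    }
    return false;
}

static void* helper_main(void* arg) {
    struct demo_thread* self = arg;
    /* ... helper work would go here ... */

    /* Drop the helper's own reference. The creator still holds one,
     * so this put cannot free the object the helper is running on. */
    if (demo_put(self))
        atomic_store(&g_helper_freed_itself, true);
    return NULL;
}

/* Starts a helper and returns its object with one reference owned by the caller. */
static struct demo_thread* demo_start_helper(void) {
    struct demo_thread* t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    atomic_init(&t->ref_count, 1); /* reference owned by the caller */
    demo_get(t);                   /* reference owned by the helper itself */
    if (pthread_create(&t->host_thread, NULL, helper_main, t) != 0) {
        free(t);
        return NULL;
    }
    return t;
}

/* Waits for the helper to exit, then releases the caller's reference.
 * Returns true if the object was freed here, i.e. by the caller. */
static bool demo_reap_helper(struct demo_thread* t) {
    pthread_join(t->host_thread, NULL);
    return demo_put(t);
}
```

A caller would do `struct demo_thread* t = demo_start_helper();` and later `demo_reap_helper(t);`. The join in `demo_reap_helper` is what makes the ordering strict in this sketch. Without it, the caller could release its reference first and the helper would again be the last one to put.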
8e2de78 to cc4b1c5
Reviewed 2 of 5 files at r2.
Reviewable status: 2 of 5 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required) (waiting on @mkow and @yamahata)
LibOS/shim/src/ipc/shim_ipc_helper.c, line 1085 at r2 (raw file):
 * on success, the reference to the helper thread is returned with
 * reference count incremented.
 * It's caller the responsibility to wait for its exit and release the
Typo (and remove pronoun) - please rephrase as "The caller is responsible to wait for the IPC helper thread to exit ......"
LibOS/shim/src/sys/shim_exit.c, line 147 at r2 (raw file):
struct shim_thread * async_thread = terminate_async_helper();
if (async_thread)
    /* TODO: wait for the thread to exit in host */
This seems like an important to-do. Perhaps at least open an issue tracker for this one? It is probably no worse than it is now, but it seems to re-introduce the same bug?
LibOS/shim/src/sys/shim_exit.c, line 153 at r2 (raw file):
int ret = exit_with_ipc_helper(true, &ipc_thread);
if (ipc_thread)
    /* TODO: wait for the thread to exit in host */
Same comment.
Reviewed 5 of 5 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required) (waiting on @yamahata)
a discussion (no related file):
Previously, donporter (Don Porter) wrote…
This works for me. Feel free to add some of the verbiage from my question if you like. I think some comments are definitely in order here. Thank you for the fix!
Looks good now :)
LibOS/shim/src/shim_async.c, line 322 at r2 (raw file):
/*
 * On succes, the reference to the thread of async helper is returned with
succes -> success
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ) (waiting on @donporter and @mkow)
LibOS/shim/src/shim_async.c, line 322 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
succes -> success
Done.
LibOS/shim/src/ipc/shim_ipc_helper.c, line 1085 at r2 (raw file):
Previously, donporter (Don Porter) wrote…
Typo (and remove pronoun) - please rephrase as "The caller is responsible to wait for the IPC helper thread to exit ......"
Done.
LibOS/shim/src/sys/shim_exit.c, line 147 at r2 (raw file):
Previously, donporter (Don Porter) wrote…
This seems like an important to-do. Perhaps at least open an issue tracker for this one? It is probably no worse than it is now, but it seems to re-introduce the same bug?
Totally agreed.
#440
LibOS/shim/src/sys/shim_exit.c, line 153 at r2 (raw file):
Previously, donporter (Don Porter) wrote…
Same comment.
Done.
Reviewable status: 2 of 5 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ) (waiting on @donporter and @mkow)
LibOS/shim/src/sys/shim_exit.c, line 147 at r2 (raw file):
Previously, yamahata wrote…
Totally agreed.
#440
Done.
Reviewed 3 of 3 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ) (waiting on @mkow)
Reviewed 3 of 3 files at r3.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required)
Retest this please
@yamahata Please go ahead and squash + rebase to current master. Thanks!
f497750 to 0585959
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Reviewed 2 of 5 files at r2.
Reviewable status: complete! all files reviewed, all discussions resolved
Ok to test
Using put_thread(self) to free resources by itself is problematic because the exiting thread is still using them. Actually, put_thread(self) logic in ipc/async helper may cause SEGV depending on who is the last to put_thread(). If the thread is the last one to free thread area, lock/unlock after freeing thread area causes SEGV due to debug message.

The solution is to keep reference count of those two helper threads by a (usually master) thread so that the (master) thread will be the last thread to release the reference count. This way self destruction can be avoided.

To completely fix this issue, helper threads need to be waited to exit. The issue is tracked by gramineproject#440.

Signed-off-by: Isaku Yamahata <[email protected]>
0585959 to e6212cc
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Retest this please
Reviewable status: complete! all files reviewed, all discussions resolved
Using put_thread(self) to free resources by itself is problematic because
the exiting thread is still using them.
In fact, the put_thread(self) logic in the ipc/async helper may cause a SEGV
depending on who is the last to call put_thread().
If the helper thread is the last one and frees its thread area, a lock/unlock
after that point causes a SEGV because of the debug message.
Signed-off-by: Isaku Yamahata [email protected]
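
The failure mode described above can be modeled without real threads. The sketch below is a simplified, single-threaded model under stated assumptions, not LibOS code: `model_thread`, `model_put`, `model_debug_unlock` and `model_run` are hypothetical names, a `freed` flag stands in for actually freeing the thread area, and a counter stands in for the SEGV.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a thread object; "freed" stands in for free(). */
struct model_thread {
    int  ref_count;
    bool freed;
};

/* Counts accesses to an object that has already been released. */
static int g_use_after_free = 0;

static void model_put(struct model_thread* t) {
    if (--t->ref_count == 0)
        t->freed = true; /* a real implementation would free the thread area here */
}

/* Models a lock/unlock whose debug message reads the current thread's data. */
static void model_debug_unlock(const struct model_thread* self) {
    if (self->freed)
        g_use_after_free++; /* in a real process this access could fault */
}

/* Helper exit path: put on itself, followed by one more lock/unlock. */
static void model_helper_exit(struct model_thread* self) {
    model_put(self);
    model_debug_unlock(self);
}

/* Returns the number of use-after-free accesses seen on the helper exit path. */
static int model_run(bool creator_keeps_ref) {
    struct model_thread t = { .ref_count = 1, .freed = false }; /* helper's own ref */
    if (creator_keeps_ref)
        t.ref_count++; /* extra reference held by the (master) thread */
    g_use_after_free = 0;
    model_helper_exit(&t);
    if (creator_keeps_ref)
        model_put(&t); /* creator releases last, after the helper is done */
    return g_use_after_free;
}
```

In this model, `model_run(false)` reports one access after release, while `model_run(true)` reports none because the final put is made by the creator after the helper has finished.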