Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] Make async worker persistent #1992

Merged
merged 1 commit into from
Sep 9, 2024

Conversation

mwkmwkmwk
Copy link
Contributor

@mwkmwkmwk mwkmwkmwk commented Sep 3, 2024

Description of the changes

The async worker thread is currently spawned on-demand and terminated after an idle timeout. This is entirely unnecessary — the cost of having a persistent async thread is near-0, and it simplifies the logic.

Fixes gramineproject/graphene#440.

How to test this PR?

Already covered by CI.


This change is Reviewable

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+28 −98

nice!

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mwkmwkmwk)


-- commits line 8 at r1:
Please briefly describe what was that bug here (just a single sentence should be enough), we want to keep the git repo standalone and self-sufficient, in case we ever move to some other repo hosting.
And if you want to include a link to the GitHub issue, please use a full HTTPS URL.

@mwkmwkmwk mwkmwkmwk force-pushed the mwk/async-persistent branch from d6887ce to 180dbc5 Compare September 4, 2024 02:28
Copy link
Contributor Author

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)


-- commits line 8 at r1:

Previously, mkow (Michał Kowalczyk) wrote…

Please briefly describe what was that bug here (just a single sentence should be enough), we want to keep the git repo standalone and self-sufficient, in case we ever move to some other repo hosting.
And if you want to include a link to the GitHub issue, please use a full HTTPS URL.

Done.

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow and @mwkmwkmwk)

a discussion (no related file):
FYI: I think this code was left from the times when helper threads (including this Async helper thread) had a spinning implementation of waiting on events. Of course now with proper PalStreamsWaitEvents() that puts the thread to sleep on the host, this "terminate the thread when no events" logic is not needed.

Fun times of barely working code from 2010s...



-- commits line 4 at r2:
Typically, we write the current status in past tense, like Previously, the async worker thread spawned .... Should we continue with this style? Currently it sounds a bit weird -- the commit message is supposed to describe the new status (imho), but now it describes the old status.


libos/src/libos_async.c line 29 at r2 (raw file):

static LISTP_TYPE(async_event) async_list;

/* Should be accessed with async_worker_lock held. */
  1. This comment is placed in a way that makes you think that async_worker_running must also be accessed with the lock held -- this is not true.
  2. Also, do we even need the lock? Why not just use atomics for async_worker_shutdown? We'll save a couple more lines of code then.

libos/src/libos_async.c line 37 at r2 (raw file):

/* TODO: use async_worker_thread->pollable_event instead */
static struct libos_pollable_event install_new_event;

I would also prefer to add the prefix g_ to all these global variables (in a separate PR), but I won't block on this.


libos/src/libos_async.c line 327 at r2 (raw file):

        return -ENOMEM;

    async_worker_thread = new;

Can we move this global assignment to the very end of the func? This will be more in line with our conventions.


libos/src/libos_async.c line 356 at r2 (raw file):

    put_thread(async_worker_thread);
    async_worker_thread = NULL;
    PalObjectDestroy(handle);

I think we don't need these four lines of clean-up. This happens on process termination, so all resources will be cleaned up by the host kernel anyway.

Also, if you want to do clean-up, then you should also destroy the lock async_worker_lock and destroy install_new_event. Right now you have something in-between...

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mwkmwkmwk)


-- commits line 4 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Typically, we write the current status in past tense, like Previously, the async worker thread spawned .... Should we continue with this style? Currently it sounds a bit weird -- the commit message is supposed to describe the new status (imho), but now it describes the old status.

Yup, +1, I'd say "previously", otherwise it's not obvious whether it describes the state before or after this commit.


libos/src/libos_async.c line 29 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
  1. This comment is placed in a way that makes you think that async_worker_running must also be accessed with the lock held -- this is not true.
  2. Also, do we even need the lock? Why not just use atomics for async_worker_shutdown? We'll save a couple more lines of code then.
  1. Let's move it to a trailing comment at the end of the line then?
  2. Yeah, I guess we could. Whatever from my side, will leave the decision to you :)

libos/src/libos_async.c line 37 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would also prefer to add the prefix g_ to all these global variables (in a separate PR), but I won't block on this.

Yeah, let's fix it separately.


libos/src/libos_async.c line 356 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I think we don't need these four lines of clean-up. This happens on process termination, so all resources will be cleaned up by the host kernel anyway.

Also, if you want to do clean-up, then you should also destroy the lock async_worker_lock and destroy install_new_event. Right now you have something in-between...

Whatever @mwkmwkmwk decides, if choosing the first option, please add a comment explaining that this function is called when terminating the whole process so it doesn't need to clean up everything.

@mwkmwkmwk mwkmwkmwk force-pushed the mwk/async-persistent branch from 180dbc5 to c8711a9 Compare September 5, 2024 14:02
Copy link
Contributor Author

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)


-- commits line 4 at r2:

Previously, mkow (Michał Kowalczyk) wrote…

Yup, +1, I'd say "previously", otherwise it's not obvious whether it describes the state before or after this commit.

Done.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mwkmwkmwk)

Copy link
Contributor Author

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)


libos/src/libos_async.c line 29 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…
  1. Let's move it to a trailing comment at the end of the line then?
  2. Yeah, I guess we could. Whatever from my side, will leave the decision to you :)

removed the lock in favor of an atomic


libos/src/libos_async.c line 327 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can we move this global assignment to the very end of the func? This will be more in line with our conventions.

Done.


libos/src/libos_async.c line 356 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Whatever @mwkmwkmwk decides, if choosing the first option, please add a comment explaining that this function is called when terminating the whole process so it doesn't need to clean up everything.

removed

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwkmwkmwk: One more thing: we need your sign-off (see https://gramine.readthedocs.io/en/latest/devel/DCO/).

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners

@mwkmwkmwk mwkmwkmwk force-pushed the mwk/async-persistent branch from a916b92 to 6db986b Compare September 6, 2024 12:42
Previously, the async worker thread was spawned on-demand and terminated
after an idle timeout. This is entirely unnecessary — the cost of having
a persistent async thread is near-0, and it simplifies the logic.

Also fixes a long-standing issue about not waiting for the async thread
to exit (gramineproject/graphene#440).

Signed-off-by: Marcelina Kościelnicka <[email protected]>
@mwkmwkmwk mwkmwkmwk force-pushed the mwk/async-persistent branch from 6db986b to aef14f1 Compare September 6, 2024 12:42
Copy link
Contributor Author

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased and signed off

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Copy link

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@dimakuv dimakuv merged commit aef14f1 into gramineproject:master Sep 9, 2024
18 checks passed
@mwkmwkmwk mwkmwkmwk deleted the mwk/async-persistent branch September 9, 2024 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

try_process_exit() should wait for the async helper thread to exit
4 participants