[LibOS] Add support for `getcpu` syscall #609

vijaydhanraj · 2022-05-26T02:13:33Z

Description of the changes

This PR was dependent on sysfs rewrite. Now that it is done, reviving this PR.

This commit adds an approx. implementation of getcpu syscall that returns a random set bit from cpu affinity mask associated
with calling thread and its corresponding NUMA node.

pthread_set_get_affinity regression test is extended to validate this implementation. This commit also enables the
getcpu01 ltp test for non-SGX case.

Closes gramineproject/graphene#2331

How to test this PR?

Please run pthread_set_get_affinity LibOS regression test.

This change is

dimakuv

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 230 at r1 (raw file):

    __UNUSED(unused);

    int ret = 0;

Let's do int ret;, without initializing to zero. The newer GCC versions can then perform additional checks that ret was set in all exiting paths.

LibOS/shim/src/sys/shim_sched.c line 258 at r1 (raw file):

    unsigned int idx = 0;
    while (idx < max_cpu_bitmask) {
        num_bits = count_ulong_bits_set(mask[idx]);

Why do we use count_ulong_bits_set() if we only need to check whether this long is empty (== 0) or not (!= 0)?

Also see my comment/proposal below.

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

         * us to the nth_setbit after nth_setbit iterations. */
        cpumask = cpumask & ~(1UL << __builtin_ctzl(cpumask));
    }

I forgot how our discussion ended in the previous version of this PR.

But why can't we simply choose a random CPU, and not the "random CPU in the first non-empty long"? Wouldn't it easier and more straight-forward, smth like this:

    size_t mask_idx = 0;
    size_t nth_setbit = rand_num % threads_cnt;

    /* At each iteration, find the lowest bit set in cpu mask and unset it; this will bring
     * us to the nth_setbit after nth_setbit iterations. */
    size_t i = 0;
    while (i < nth_setbit) {
        if (mask[mask_idx] == 0) {
            mask_idx++;
            /* TODO: check that we don't overflow mask */
            continue;
        }

        mask[mask_idx] = mask[mask_idx] & ~(1UL << __builtin_ctzl(mask[mask_idx]));
        i++;
    }

unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) + BITS_IN_TYPE(unsigned long) * mask_idx;

LibOS/shim/test/regression/pthread_set_get_affinity.c line 37 at r1 (raw file):

    int ret = syscall(SYS_getcpu, &cpu, &node);
    if (ret < 0)
        err(EXIT_FAILURE, "sched_getcpu failed!");

sched_getcpu -> getcpu

LibOS/shim/test/regression/pthread_set_get_affinity.c line 39 at r1 (raw file):

        err(EXIT_FAILURE, "sched_getcpu failed!");

    printf("Thread %ld is running on cpu: %d, node: %d\n", syscall(SYS_gettid), cpu, node);

The formatters for cpu and node must be %u (not %d)

vijaydhanraj

Reviewable status: 1 of 3 files reviewed, 5 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

LibOS/shim/src/sys/shim_sched.c line 230 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Let's do int ret;, without initializing to zero. The newer GCC versions can then perform additional checks that ret was set in all exiting paths.

Sure, I would also like to add a ret = 0 before returning as well even though all the DK* functions return 0 on success. Just in case the return values of Dk*functions change in the future.

LibOS/shim/src/sys/shim_sched.c line 258 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do we use count_ulong_bits_set() if we only need to check whether this long is empty (== 0) or not (!= 0)?

Also see my comment/proposal below.

For either approach (please see my response), we need count_ulong_bits_set() to count the number of cores that are part of the thread's affinity list. We then select any random core from that list.

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

I forgot how our discussion ended in the previous version of this PR.

I don't think we had any conclusion. IIRC, we discussed two options:

Find the first non-empty cpumask and select a random bit set from it.
Count all the cores present (bit set) in the cpumask and then randomly set a CPU from it.

But why can't we simply choose a random CPU, and not the "random CPU in the first non-empty long"? Wouldn't it easier and more straight-forward, smth like this:

The code snippet below aligns with option 2. Yes, we can do it either way but I preferred option 1, as we can stop right after we find a non-empty cpumask as we just need to return a random CPU present in the thread's affinity and not keep iterating futher.

With the suggested code, I don't think we can randomly select any core from threads_cnt. We first need to find the cores that are part of the thread's affinity list and then select any random core from that. So, the code will look something like the below which is very similar to my current implementation just that we will do more iterations now given that we find a random core from the entire cpumask.

    unsigned int num_bits = 0;
    unsigned int idx = 0;
    while (idx < max_cpu_bitmask) {
        num_bits += count_ulong_bits_set(mask[idx]);
        idx++;
    }
    
    /* There should be at least one bit set as part of the cpu affinity mask otherwise host is
       malicious. */
    if (num_bits == 0) {
        ret = -EINVAL;
        goto out;
    }
    
    /* Generate a random number and use it to find a random bit set in the first non-empty
     * unsigned long of the cpu affinity mask. */
    unsigned long rand_num = 0;
    ret = DkRandomBitsRead(&rand_num, sizeof(rand_num));
    if (ret < 0) {
        ret = pal_to_unix_errno(ret);
        goto out;
    }
    
    unsigned int nth_setbit = rand_num % num_bits;
    
    /* At each iteration, find the lowest bit set in cpu mask and unset it; this will bring
     * us to the nth_setbit after nth_setbit iterations. */
    size_t i = 0;
    while (i < nth_setbit) {
        if (mask[mask_idx] == 0) {
            mask_idx++;
            /* TODO: check that we don't overflow mask */
            continue;
        }

        mask[mask_idx] = mask[mask_idx] & ~(1UL << __builtin_ctzl(mask[mask_idx]));
        i++;
    }
    
    unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) + BITS_IN_TYPE(unsigned long) * mask_idx;

LibOS/shim/test/regression/pthread_set_get_affinity.c line 37 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

sched_getcpu -> getcpu

Done.

LibOS/shim/test/regression/pthread_set_get_affinity.c line 39 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The formatters for cpu and node must be %u (not %d)

Yes, you are right. Fixed it.

mkow

Reviewed 1 of 3 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 260 at r2 (raw file):

        num_bits = count_ulong_bits_set(mask[idx]);
        if (num_bits)
            break;

If you removed this if, changed = above to += and added a two-line while in the loop below, then you'd get solution which properly picks a random CPU from the whole affinity mask. So, +/- same complexity and less weird behavior :)

LibOS/shim/src/sys/shim_sched.c line 262 at r2 (raw file):

            break;
        idx++;
     }

wrong indentation

LibOS/shim/src/sys/shim_sched.c line 265 at r2 (raw file):

    /* There should be at least one bit set as part of the cpu affinity mask otherwise host is
       malicious. */

Suggestion:

    /* There should be at least one bit set as part of the cpu affinity mask otherwise host is
     * doing something malicious. */

vijaydhanraj

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I forgot how our discussion ended in the previous version of this PR.

I don't think we had any conclusion. IIRC, we discussed two options:

Find the first non-empty cpumask and select a random bit set from it.

Count all the cores present (bit set) in the cpumask and then randomly set a CPU from it.

But why can't we simply choose a random CPU, and not the "random CPU in the first non-empty long"? Wouldn't it easier and more straight-forward, smth like this:

The code snippet below aligns with option 2. Yes, we can do it either way but I preferred option 1, as we can stop right after we find a non-empty cpumask as we just need to return a random CPU present in the thread's affinity and not keep iterating futher.

With the suggested code, I don't think we can randomly select any core from threads_cnt. We first need to find the cores that are part of the thread's affinity list and then select any random core from that. So, the code will look something like the below which is very similar to my current implementation just that we will do more iterations now given that we find a random core from the entire cpumask.
    unsigned int num_bits = 0;
    unsigned int idx = 0;
    while (idx < max_cpu_bitmask) {
        num_bits += count_ulong_bits_set(mask[idx]);
        idx++;
    }
    
    /* There should be at least one bit set as part of the cpu affinity mask otherwise host is
       malicious. */
    if (num_bits == 0) {
        ret = -EINVAL;
        goto out;
    }
    
    /* Generate a random number and use it to find a random bit set in the first non-empty
     * unsigned long of the cpu affinity mask. */
    unsigned long rand_num = 0;
    ret = DkRandomBitsRead(&rand_num, sizeof(rand_num));
    if (ret < 0) {
        ret = pal_to_unix_errno(ret);
        goto out;
    }
    
    unsigned int nth_setbit = rand_num % num_bits;
    
    /* At each iteration, find the lowest bit set in cpu mask and unset it; this will bring
     * us to the nth_setbit after nth_setbit iterations. */
    size_t i = 0;
    while (i < nth_setbit) {
        if (mask[mask_idx] == 0) {
            mask_idx++;
            /* TODO: check that we don't overflow mask */
            continue;
        }

        mask[mask_idx] = mask[mask_idx] & ~(1UL << __builtin_ctzl(mask[mask_idx]));
        i++;
    }
    
    unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) + BITS_IN_TYPE(unsigned long) * mask_idx; 

@mkow I think you are suggesting something like the above the code right?

LibOS/shim/src/sys/shim_sched.c line 260 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

If you removed this if, changed = above to += and added a two-line while in the loop below, then you'd get solution which properly picks a random CPU from the whole affinity mask. So, +/- same complexity and less weird behavior :)

Please see the other thread where I have added some code. I think you are referring to a similar approach.

dimakuv

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 258 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

For either approach (please see my response), we need count_ulong_bits_set() to count the number of cores that are part of the thread's affinity list. We then select any random core from that list.

You are right.

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

With the suggested code, I don't think we can randomly select any core from threads_cnt. We first need to find the cores that are part of the thread's affinity list and then select any random core from that.

Yes, sorry, you are right. I forgot that threads_cnt gives a number of all CPU logical cores. But what we need here is the number of possible CPU logical cores for this software thread (which is limited by the affinity list).

So yes, then your code snippet is more correct. I actually like your snippet, because as Michal puts it -- it has "less weird" behavior. So can we change to option 2 then?

mkow

Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

With the suggested code, I don't think we can randomly select any core from threads_cnt. We first need to find the cores that are part of the thread's affinity list and then select any random core from that.

Yes, sorry, you are right. I forgot that threads_cnt gives a number of all CPU logical cores. But what we need here is the number of possible CPU logical cores for this software thread (which is limited by the affinity list).

So yes, then your code snippet is more correct. I actually like your snippet, because as Michal puts it -- it has "less weird" behavior. So can we change to option 2 then?

@vijaydhanraj: The proposed snippet looks good overall, modulo:

The first while can now be shortened to a for loop.
This /* TODO: check that we don't overflow mask */ is not needed, it's not possible to overflow it.
The second while should also be a for loop.
The if in the second while should be while (right now this snippet is incorrect).

vijaydhanraj

Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

@vijaydhanraj: The proposed snippet looks good overall, modulo:

The first while can now be shortened to a for loop.

This /* TODO: check that we don't overflow mask */ is not needed, it's not possible to overflow it.

The second while should also be a for loop.

The if in the second while should be while (right now this snippet is incorrect).

Thanks, will switch to option 2 and send an updated patch.

vijaydhanraj

Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 286 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Thanks, will switch to option 2 and send an updated patch.

Done.

LibOS/shim/src/sys/shim_sched.c line 262 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

wrong indentation

Not applicable with the recent change.

LibOS/shim/src/sys/shim_sched.c line 265 at r2 (raw file):

    /* There should be at least one bit set as part of the cpu affinity mask otherwise host is
       malicious. */

Done.

mkow

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

    if (node) {
        size_t core_id = g_pal_public_state->topo_info.threads[cpu_current].core_id;

Security vuln here: cpu_current may be bigger than threads_cnt or point to an offline thread, because the affinity mask is just forwarded from the host. This will cause us to leak enclave memory to the attacker.
The core issue is that we don't emulate the affinity in LibOS/PAL and just blindly trust the host that it will return consistent information. Could you create another PR to fix it? It should be rather simple, we should keep the affinity inside LibOS and only additionally ask the host to synchronize the real one with the one held in LibOS structures.

vijaydhanraj

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

cpu_current may be bigger than threads_cnt or point to an offline thread, because the affinity mask is just forwarded from the host. This will cause us to leak enclave memory to the attacker.

Yes, you are right, good catch.

we should keep the affinity inside LibOS and only additionally ask the host to synchronize the real one with the one held in LibOS structures.

Are you suggesting we cache the affinity information in LibOS every time the application calls sched_setaffinity for a given PID and we return the cached affinity when the application calls sched_getaffinity for that PID? If not, can you please explain?

mkow

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

cpu_current may be bigger than threads_cnt or point to an offline thread, because the affinity mask is just forwarded from the host. This will cause us to leak enclave memory to the attacker.

Yes, you are right, good catch.

we should keep the affinity inside LibOS and only additionally ask the host to synchronize the real one with the one held in LibOS structures.

Are you suggesting we cache the affinity information in LibOS every time the application calls sched_setaffinity for a given PID and we return the cached affinity when the application calls sched_getaffinity for that PID? If not, can you please explain?

Yes. And we also need to do all the checking of the requested affinity to disallow the app to accidentally set affinity to an offline core (assuming this is also illegal on Linux, if it's legal, then we'll need a safeguard the code here).

vijaydhanraj

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):
Got it, thanks. Alternatively, can we add something like the below to address this issue? Since this is the only function where we directly use the output of DkThreadGetCpuAffinity() within LibOS.

diff --git a/LibOS/shim/src/sys/shim_sched.c b/LibOS/shim/src/sys/shim_sched.c
index 4e41a22a5..3b51a3a75 100644
--- a/LibOS/shim/src/sys/shim_sched.c
+++ b/LibOS/shim/src/sys/shim_sched.c
@@ -287,6 +287,14 @@ long shim_do_getcpu(unsigned* cpu, unsigned* node, struct getcpu_cache* unused)
     unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) +
                                BITS_IN_TYPE(unsigned long) * mask_idx;
 
+    /* Since the cpu affinity mask is untrusted (forwarded from host) ensure `cpu_current` is a
+     * valid core that is online. */
+    if (cpu_current >= threads_cnt ||
+            !g_pal_public_state->topo_info.threads[cpu_current].is_online) {
+        ret = -EINVAL;
+        goto out;
+    }
+
     if (cpu)
         *cpu = cpu_current;

disallow the app to accidentally set affinity to an offline core

Based on my quick check, I see that Linux sched get/set affinity calls fail if we try to affinitize application on an offlined core.

mkow

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @vijaydhanraj)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Got it, thanks. Alternatively, can we add something like the below to address this issue? Since this is the only function where we directly use the output of DkThreadGetCpuAffinity() within LibOS.
diff --git a/LibOS/shim/src/sys/shim_sched.c b/LibOS/shim/src/sys/shim_sched.c
index 4e41a22a5..3b51a3a75 100644
--- a/LibOS/shim/src/sys/shim_sched.c
+++ b/LibOS/shim/src/sys/shim_sched.c
@@ -287,6 +287,14 @@ long shim_do_getcpu(unsigned* cpu, unsigned* node, struct getcpu_cache* unused)
     unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) +
                                BITS_IN_TYPE(unsigned long) * mask_idx;
 
+    /* Since the cpu affinity mask is untrusted (forwarded from host) ensure `cpu_current` is a
+     * valid core that is online. */
+    if (cpu_current >= threads_cnt ||
+            !g_pal_public_state->topo_info.threads[cpu_current].is_online) {
+        ret = -EINVAL;
+        goto out;
+    }
+
     if (cpu)
         *cpu = cpu_current;
disallow the app to accidentally set affinity to an offline core

Based on my quick check, I see that Linux sched get/set affinity calls fail if we try to affinitize application on an offlined core.

No, please fix the issue at the core. IFying it here will only delay the moment when we accidentally do exactly the same somewhere else and introduce a security vuln.

vijaydhanraj

Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

No, please fix the issue at the core. IFying it here will only delay the moment when we accidentally do exactly the same somewhere else and introduce a security vuln.

Sure, will create another PR to address the affinity issue.

dimakuv

Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Sure, will create another PR to address the affinity issue.

I agree with @mkow. Let's create a proper "shadow" copy of the affinity mask for each thread, which will be stored inside the LibOS enclave memory and used for validation on get/setaffinity calls.

I assume the PR can do something like this:

Introduce a new field to shim_thread, something like uint64_t cpumask[32];. Here I opt for a static array that can hold 2048 CPUs -- I believe this is more than enough for any practical purposes. Static array is IMHO better than a pointer to a dynamically allocated object because checkpoint-restore logic will be much simpler (just copy the field from parent object to child object).
During setaffinity syscall, memorize the user-provided CPU mask into this shim_thread->cpumask.
During getaffinity syscall, validate that the returned-from-host (via DkThreadGetCpuAffinity) CPU mask is a subset (? smth like this) of shim_thread->cpumask. Here we need to carefully check the man pages on what is the correct behavior (e.g., what if the user-supplied CPU mask included an offlined CPU). Also, at this point, shim_thread->cpumask needs to be updated to the corrected value.
This getcpu syscall should also have a validation after DkThreadGetCpuAffinity. Probably we need a wrapper function around DkThreadGetCpuAffinity, so that we don't have duplicated code.

After such a PR is created, reviewed and merged, we can continue with this PR -- rebase on top of that one.

mkow

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

During getaffinity syscall, validate that the returned-from-host (via DkThreadGetCpuAffinity) CPU mask is a subset (? smth like this) of shim_thread->cpumask. Here we need to carefully check the man pages on what is the correct behavior (e.g., what if the user-supplied CPU mask included an offlined CPU). Also, at this point, shim_thread->cpumask needs to be updated to the corrected value.

No, getaffinity should just return the in-LibOS mask. And it's not "a cache" - from our LibOS perspective that is the true affinity.

This getcpu syscall should also have a validation after DkThreadGetCpuAffinity. Probably we need a wrapper function around DkThreadGetCpuAffinity, so that we don't have duplicated code.

Not if DkThreadGetCpuAffinity is handled entirely in LibOS (and probably removed from PAL API?).

dimakuv

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):
@mkow But Gramine itself can be run with smth like cpuset -c 0-3 gramine-sgx ..., which restricts the affinities of threads already, and is not reflected in /sys/ files. But it will be reflected in DkThreadGetCpuAffinity().

This way of running Gramine is used sometimes, especially for performance benchmarking purposes. If Gramine doesn't recognize this, we could end up with app inside Gramine having a totally wrong understanding of the schedulable cores.

Thinking about it... Maybe I'm wrong? But I see this in man setcpu:

This can result in these other calls [sched_setaffinity] returning an error, if for example, such a call ends up requesting an empty set of CPUs or memory nodes, after that request is restricted to the invoking process's cpuset.

Well, this needs some investigation... Why is Linux so hard.

mkow

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@mkow But Gramine itself can be run with smth like cpuset -c 0-3 gramine-sgx ..., which restricts the affinities of threads already, and is not reflected in /sys/ files. But it will be reflected in DkThreadGetCpuAffinity().

This way of running Gramine is used sometimes, especially for performance benchmarking purposes. If Gramine doesn't recognize this, we could end up with app inside Gramine having a totally wrong understanding of the schedulable cores.

Thinking about it... Maybe I'm wrong? But I see this in man setcpu:

This can result in these other calls [sched_setaffinity] returning an error, if for example, such a call ends up requesting an empty set of CPUs or memory nodes, after that request is restricted to the invoking process's cpuset.

Well, this needs some investigation... Why is Linux so hard.

Heh :)
Anyways, if it turns out we need special handling for this, then we only need to call DkThreadGetCpuAffinity() at the initialization and set the internal affinity to the result.

dimakuv

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Heh :)
Anyways, if it turns out we need special handling for this, then we only need to call DkThreadGetCpuAffinity() at the initialization and set the internal affinity to the result.

Yes, I agree with this (calling DkThreadGetCpuAffinity at LibOS initialization). Should be enough.

vijaydhanraj

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):
Thanks @mkow @dimakuv for the clarification. Is calling DkThreadGetCpuAffinity during LibOS initialization to determine the default affinity of the thread? If so, can't we set it to0xFFF..F meaning affinitized to run on all cores as cpuset takes higher precedence over setting the affinity programmatically? We can then remove the DkThreadGetCpuAffinity PAL API.

This can result in these other calls [sched_setaffinity] returning an error, if for example, such a call ends up requesting an empty set of CPUs or memory nodes, after that request is restricted to the invoking process's cpuset.

This I think is a user error and better not handled in Gramine. A legit user who restricts the application affinity using cpuset should be aware of setting the right affinity programmatically.

vijaydhanraj

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Thanks @mkow @dimakuv for the clarification. Is calling DkThreadGetCpuAffinity during LibOS initialization to determine the default affinity of the thread? If so, can't we set it to0xFFF..F meaning affinitized to run on all cores as cpuset takes higher precedence over setting the affinity programmatically? We can then remove the DkThreadGetCpuAffinity PAL API.

This can result in these other calls [sched_setaffinity] returning an error, if for example, such a call ends up requesting an empty set of CPUs or memory nodes, after that request is restricted to the invoking process's cpuset.

This I think is a user error and better not handled in Gramine. A legit user who restricts the application affinity using cpuset should be aware of setting the right affinity programmatically.

Had an offline chat with @dimakuv and did a few experiments to check if we need to store DkThreadGetCpuAffinity during LibOS initialization and it looks like we do need this.

First experiment: Get default affinity on a 128-core system.

Default thread affinity using pthread_getaffinity_np: cpumask[0] = 0xffffffffffffffff, cpumask[1] = 0xffffffffffffffff
Default thread affinity using sched_getaffinity: cpumask[0] = 0xffffffffffffffff, cpumask[1] = 0xffffffffffffffff

As we can see both pthread_getaffinity_np1 and sched_getaffinity show the thread affinitized to all the cores in the system.

Second experiment: Limit affinity using numactl -C 0-3 gramine-direct test-app

Default thread affinity using pthread_getaffinity_np: cpumask[0] = 0xf, cpumask[1] = 0x0
Default thread affinity using sched_getaffinity: cpumask[0] = 0xf, cpumask[1] = 0x0

Here we can see that both pthread_getaffinity_np and sched_getaffinity show the restrictive affinity set by numactl.

From the second experiment, we can conclude that DkThreadGetCpuAffinity needs to be called during LibOS initialization and the affinity stored else we might end up returning incorrect affinity to the user.

Found this https://serverfault.com/questions/625146/difference-between-taskset-and-cpuset useful to understand the difference between taskset vs cpuset.

mkow

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

If so, can't we set it to 0xFFF..F meaning affinitized to run on all cores as cpuset takes higher precedence over setting the affinity programmatically? We can then remove the DkThreadGetCpuAffinity PAL API.

No, we can't, because that's the very thing we want to prevent by this change - to disallow the affinity mask to contain non-existent CPU threads.

vijaydhanraj

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

If so, can't we set it to 0xFFF..F meaning affinitized to run on all cores as cpuset takes higher precedence over setting the affinity programmatically? We can then remove the DkThreadGetCpuAffinity PAL API.

No, we can't, because that's the very thing we want to prevent by this change - to disallow the affinity mask to contain non-existent CPU threads.

Sorry, what I meant was to have default affinity set during thread initialization which is all available processors after removing the offlined and non-existent cores. So, this will be a legit mask having all the available cores. But even if we manage to do this, it will not work as users can launch Gramine app with a specific affinity using numactland LibOS would be totally oblivious to this. So, I do agree that DkThreadGetCpuAffinity needs to be called during LibOS initialization and the affinity stored. This is what I had experimented and posted..

mkow

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Sorry, what I meant was to have default affinity set during thread initialization which is all available processors after removing the offlined and non-existent cores. So, this will be a legit mask having all the available cores. But even if we manage to do this, it will not work as users can launch Gramine app with a specific affinity using numactland LibOS would be totally oblivious to this. So, I do agree that DkThreadGetCpuAffinity needs to be called during LibOS initialization and the affinity stored. This is what I had experimented and posted..

Ok, makes more sense now ;)

vijaydhanraj

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

LibOS/shim/src/sys/shim_sched.c line 294 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Ok, makes more sense now ;)

@mkow @dimakuv Created PR #631 to address the issue. Can you please take a look?

dimakuv · 2022-08-02T17:58:40Z

@vijaydhanraj We should rebase this PR, right?

vijaydhanraj · 2022-08-02T18:16:27Z

@dimakuv Yes correct. Working on it.

vijaydhanraj

Rebased and forced pushed the changes. PTAL.

Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)

dimakuv

Reviewed 6 of 6 files at r4, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow and @vijaydhanraj)

libos/src/sys/libos_sched.c line 254 at r4 (raw file):

        return -EFAULT;

    /* Allocate memory to hold the thread's cpu affinity mask. */

to hold the -> to hold the copy of the

libos/src/sys/libos_sched.c line 261 at r4 (raw file):

    lock(&thread->lock);
    memcpy(mask, thread->cpu_affinity_mask, GET_CPU_MASK_LEN() * sizeof(unsigned long));

sizeof(unsigned long) -> sizeof(thread->cpu_affinity_mask)

libos/src/sys/libos_sched.c line 275 at r4 (raw file):

        ret = -EINVAL;
        goto out;
    }

num_bits == 0 shouldn't be possible anymore. There will always be at least one bit set in the cpu affinity mask. Please remove this check (+ the comment) and do simply:

assert(num_bits);

libos/src/sys/libos_sched.c line 300 at r4 (raw file):

    unsigned int cpu_current = __builtin_ctzl(mask[mask_idx]) +
                               BITS_IN_TYPE(unsigned long) * mask_idx;

Sorry, I know that we've discussed this so many times, but I still find it extremely hard to understand your logic. Let's rewrite it to be simpler. This is the 10-th time I look at this code and I get lost.

I suggest an algorithm like this:

/* we know num_bits and rand_num */
cpu_bit_idx = rand_num % num_bits;

mask_idx = 0;
while (cpu_bit_idx > count_ulong_bits_set(mask[idx])) {
    cpu_bit_idx -= count_ulong_bits_set(mask[idx]);
    mask_idx++;
}

cpu_idx = 0;
for (; cpu_idx < BITS_IN_TYPE(__typeof__(*mask)); cpu_idx++) {
    if (mask[mask_idx] & (1ul << cpu_idx)) {
        if (!cpu_bit_idx)
            break;
        cpu_bit_idx--;
    }
}

cpu_current = BITS_IN_TYPE(__typeof__(*mask)) * mask_idx + cpu_idx;

Something like this.

vijaydhanraj

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, and @vijaydhanraj)

libos/test/regression/pthread_set_get_affinity.c line 105 at r6 (raw file):

In step 1 though, the number of threads-to-spawn must be limited via MANIFEST_SGX_THREAD_CNT - (INTERNAL_THREAD_CNT + MAIN_THREAD_CNT), same as now.

Yes, will keep this same as now.

boryspoplawski

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, and @vijaydhanraj)

libos/test/regression/pthread_set_get_affinity.c line 105 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

In step 1 though, the number of threads-to-spawn must be limited via MANIFEST_SGX_THREAD_CNT - (INTERNAL_THREAD_CNT + MAIN_THREAD_CNT), same as now.

Yes, will keep this same as now.

On a high level looks good.

vijaydhanraj

Reviewable status: 4 of 6 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @mkow)

libos/src/sys/libos_sched.c line 278 at r7 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

forgot about this

Done.

libos/test/regression/pthread_set_get_affinity.c line 49 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Got it, will add the check.

Removed this shifting logic and used CPU_* MACROs.

libos/test/regression/pthread_set_get_affinity.c line 105 at r6 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

On a high level looks good.

Done.

libos/test/regression/pthread_set_get_affinity.c line 48 at r7 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

This is not cpumask

Done.

libos/test/regression/pthread_set_get_affinity.c line 109 at r7 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

There is no point in wrapping this in a struct, you can pass it directly.

Done.

boryspoplawski

Reviewed 2 of 2 files at r8, all commit messages.
Reviewable status: all files reviewed, 14 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

libos/test/regression/pthread_set_get_affinity.c line 39 at r8 (raw file):

    cpu_set_t* thread_cpuaffinity = (cpu_set_t*)args;
    cpu_set_t cpumask;

This needs to be cleared first.

libos/test/regression/pthread_set_get_affinity.c line 42 at r8 (raw file):

    CPU_SET(cpu, &cpumask);
    CPU_AND(&cpumask, &cpumask, thread_cpuaffinity);
    if (!CPU_COUNT(&cpumask)) {

Suggestion:

    if (!CPU_ISSET(cpu, thread_cpuaffinity)) {

libos/test/regression/pthread_set_get_affinity.c line 64 at r8 (raw file):

        CPU_OR(set_cpumask, set_cpumask, online_cpumask);
        return 0;
    }

All of this is not really useful, please remove it.

Code quote:

    int online_cores_from_mask = CPU_COUNT(online_cpumask);
    if (online_cores_from_mask == 0)
        return -1;

    CPU_ZERO(set_cpumask);
    if (online_cores_from_mask <= 2) {
        CPU_OR(set_cpumask, set_cpumask, online_cpumask);
        return 0;
    }

libos/test/regression/pthread_set_get_affinity.c line 107 at r8 (raw file):

     * also set MANIFEST_SGX_THREAD_CNT to the same value.
     */
    long numthreads = min(online_cores, (MANIFEST_SGX_THREAD_CNT -

Suggestion:

size_t

libos/test/regression/pthread_set_get_affinity.c line 112 at r8 (raw file):

    /* Each thread will be affinitized to run on 2 distinct cores. So reduce the number of threads
     * to half of cores. */
    numthreads = (numthreads >= 2) ? numthreads/2 : 1;

Suggestion:

numthreads = MAX(numthreads / 2, 1);

libos/test/regression/pthread_set_get_affinity.c line 124 at r8 (raw file):

    }

    /* Validate parent set/get affinity for child */

Please move this comment before loop, or even remove it (it's not useful).

libos/test/regression/pthread_set_get_affinity.c line 127 at r8 (raw file):

    cpu_set_t* set_cpumask = malloc(numthreads * sizeof(*set_cpumask));
    if (!set_cpumask) {
        free(threads);

Actually, please remove all these free which are before errx - they are pointless anyway and just bloat the code.

libos/test/regression/pthread_set_get_affinity.c line 132 at r8 (raw file):

    cpu_set_t get_cpumask;
    for (long i = 0; i < numthreads; i++) {

Suggestion:

size_t

libos/test/regression/pthread_set_get_affinity.c line 141 at r8 (raw file):

        }

        ret = pthread_create(&threads[i], NULL, dowork, (void*)&set_cpumask[i]);

Suggestion:

&set_cpumask[i]

libos/test/regression/pthread_set_get_affinity.c line 163 at r8 (raw file):

        }

        if (!CPU_EQUAL_S(sizeof(set_cpumask[i]), &set_cpumask[i], &get_cpumask)) {

Suggestion:

if (!CPU_EQUAL(&set_cpumask[i], &get_cpumask)) {

vijaydhanraj

Reviewable status: 5 of 6 files reviewed, 14 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, and @mkow)

libos/test/regression/pthread_set_get_affinity.c line 39 at r8 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

This needs to be cleared first.

Using CPU_ISSET as suggested so this is not applicable as the variable itself is removed.

libos/test/regression/pthread_set_get_affinity.c line 42 at r8 (raw file):

    CPU_SET(cpu, &cpumask);
    CPU_AND(&cpumask, &cpumask, thread_cpuaffinity);
    if (!CPU_COUNT(&cpumask)) {

Yes, this is better. Done.

libos/test/regression/pthread_set_get_affinity.c line 64 at r8 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

All of this is not really useful, please remove it.

Done.

libos/test/regression/pthread_set_get_affinity.c line 107 at r8 (raw file):

     * also set MANIFEST_SGX_THREAD_CNT to the same value.
     */
    long numthreads = min(online_cores, (MANIFEST_SGX_THREAD_CNT -

Done.

libos/test/regression/pthread_set_get_affinity.c line 112 at r8 (raw file):

    /* Each thread will be affinitized to run on 2 distinct cores. So reduce the number of threads
     * to half of cores. */
    numthreads = (numthreads >= 2) ? numthreads/2 : 1;

Done.

libos/test/regression/pthread_set_get_affinity.c line 124 at r8 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Please move this comment before loop, or even remove it (it's not useful).

Done.

libos/test/regression/pthread_set_get_affinity.c line 127 at r8 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Actually, please remove all these free which are before errx - they are pointless anyway and just bloat the code.

Done.

libos/test/regression/pthread_set_get_affinity.c line 132 at r8 (raw file):

    cpu_set_t get_cpumask;
    for (long i = 0; i < numthreads; i++) {

Done.

libos/test/regression/pthread_set_get_affinity.c line 141 at r8 (raw file):

        }

        ret = pthread_create(&threads[i], NULL, dowork, (void*)&set_cpumask[i]);

Done.

libos/test/regression/pthread_set_get_affinity.c line 163 at r8 (raw file):

        }

        if (!CPU_EQUAL_S(sizeof(set_cpumask[i]), &set_cpumask[i], &get_cpumask)) {

Done.

dimakuv

Reviewed 1 of 2 files at r8, 1 of 1 files at r9, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @mkow)

libos/test/regression/pthread_set_get_affinity.c line 16 at r9 (raw file):

#include <stdio.h>
#include <stdlib.h>
#include <sys/param.h>

FYI: I like this new version of the test!

boryspoplawski

Reviewed 1 of 1 files at r9, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @mkow)

boryspoplawski · 2022-08-25T11:31:12Z

Jenkins, test this please

boryspoplawski

Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @mkow)

boryspoplawski · 2022-08-25T19:20:03Z

Jenkins, test this please

dimakuv

Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @mkow)

common/src/ubsan.c line 130 at r10 (raw file):

                       "null pointer returned from function declared to never return null")
UBSAN_SIMPLE_HANDLER_1(invalid_builtin,
                       "passing invalid argument to builtin %ld")

FYI: for history, Vijay added this handler because otherwise UBSan complained like this:

gramine/build/../libos/src/sys/libos_sched.c:283: undefined reference to `__ubsan_handle_invalid_builtin_abort'

This happens on this PR because the pthread_get_set_affinity LibOS regression test now uses __builtin_ctzl() GCC builtin function. And this function has special treatment in UBSan, see https://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20170724/199351.html

So by adding this builtin, we glue our UBSan implementation with the common UBSan library, and when UBSan at compile-time looks at the pthread_get_set_affinity code, it can correctly instrument the __builtin_ctzl() function and add a check with a call to __ubsan_handle_invalid_builtin_abort(). (At least that's my general understanding of how it works, details may diverge.)

mkow

Reviewed 4 of 6 files at r4, 1 of 2 files at r8, 1 of 1 files at r9, 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)

libos/test/regression/pthread_set_get_affinity.c line 27 at r10 (raw file):

pthread_barrier_t barrier;

static void* dowork(void* args) {

(it was like this before, but please fix)

Suggestion:

do_work

libos/test/regression/pthread_set_get_affinity.c line 41 at r10 (raw file):

    if (!CPU_ISSET(cpu, thread_cpuaffinity)) {
        errx(EXIT_FAILURE, "cpu = %d is not part of thread %ld affinity mask", cpu,
                            syscall(SYS_gettid));

wrong alignment

libos/test/regression/pthread_set_get_affinity.c line 94 at r10 (raw file):

     * also set MANIFEST_SGX_THREAD_CNT to the same value.
     */
    size_t numthreads = MIN(online_cores, (MANIFEST_SGX_THREAD_CNT -

We usually move the trailing operator to the next line

libos/test/regression/pthread_set_get_affinity.c line 117 at r10 (raw file):

    cpu_set_t get_cpumask;
    for (size_t i = 0; i < numthreads; i++) {
        /* Find cores that will be affinitized to this thread. */

Suggestion:

Select cores that will be affinitized to this thread.

libos/test/regression/pthread_set_get_affinity.c line 120 at r10 (raw file):

        ret = select_thread_cpu_affinity(&set_cpumask[i], &online_cpumask);
        if (ret < 0) {
            errx(EXIT_FAILURE, "Cannot find cores to affinitize threads");

Suggestion:

Cannot select cores to affinitize threads

vijaydhanraj

Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)

libos/test/regression/pthread_set_get_affinity.c line 27 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

(it was like this before, but please fix)

Done.

libos/test/regression/pthread_set_get_affinity.c line 41 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

wrong alignment

I have aligned to matching parenthesis. If this is incorrect, can you please suggest it? I am not too sure about this.

libos/test/regression/pthread_set_get_affinity.c line 94 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

We usually move the trailing operator to the next line

Thanks, done.

libos/test/regression/pthread_set_get_affinity.c line 117 at r10 (raw file):

    cpu_set_t get_cpumask;
    for (size_t i = 0; i < numthreads; i++) {
        /* Find cores that will be affinitized to this thread. */

Done.

libos/test/regression/pthread_set_get_affinity.c line 120 at r10 (raw file):

        ret = select_thread_cpu_affinity(&set_cpumask[i], &online_cpumask);
        if (ret < 0) {
            errx(EXIT_FAILURE, "Cannot find cores to affinitize threads");

Done.

mkow

Reviewed 1 of 1 files at r11, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners

boryspoplawski

Reviewed 1 of 1 files at r11, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners

dimakuv · 2022-09-06T14:48:55Z

Jenkins, test this please

dimakuv

Reviewed 1 of 1 files at r11, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, "fixup! " found in commit messages' one-liners

boryspoplawski

Reviewed all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

This commit adds an approx. implementation of `getcpu` syscall that returns a random set bit from cpu affinity mask associated with calling thread and its corresponding NUMA node. `pthread_set_get_affinity` regression test is extended to validate this implementation. This commit also enables the `getcpu01` ltp test for non-SGX case. Signed-off-by: Vijay Dhanraj <[email protected]>

dimakuv

Reviewed all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved

dimakuv · 2022-09-06T18:02:07Z

Jenkins, test this please

dimakuv suggested changes May 30, 2022

View reviewed changes

vijaydhanraj commented May 30, 2022

View reviewed changes

mkow requested changes May 30, 2022

View reviewed changes

vijaydhanraj commented May 31, 2022

View reviewed changes

dimakuv suggested changes May 31, 2022

View reviewed changes

mkow reviewed May 31, 2022

View reviewed changes

vijaydhanraj commented May 31, 2022

View reviewed changes

mkow requested changes May 31, 2022

View reviewed changes

vijaydhanraj commented May 31, 2022

View reviewed changes

mkow reviewed May 31, 2022

View reviewed changes

vijaydhanraj commented May 31, 2022

View reviewed changes

mkow reviewed Jun 1, 2022

View reviewed changes

vijaydhanraj commented Jun 1, 2022

View reviewed changes

dimakuv suggested changes Jun 1, 2022

View reviewed changes

mkow reviewed Jun 1, 2022

View reviewed changes

dimakuv suggested changes Jun 1, 2022

View reviewed changes

mkow reviewed Jun 1, 2022

View reviewed changes

dimakuv suggested changes Jun 1, 2022

View reviewed changes

vijaydhanraj commented Jun 1, 2022

View reviewed changes

mkow reviewed Jun 1, 2022

View reviewed changes

vijaydhanraj commented Jun 2, 2022

View reviewed changes

mkow reviewed Jun 2, 2022

View reviewed changes

vijaydhanraj commented Jun 3, 2022

View reviewed changes

vijaydhanraj force-pushed the vdhanraj/add_getcpu_sycall_support branch from c215a36 to 204f361 Compare August 4, 2022 01:38

vijaydhanraj commented Aug 4, 2022

View reviewed changes

dimakuv suggested changes Aug 4, 2022

View reviewed changes

vijaydhanraj commented Aug 24, 2022

View reviewed changes

boryspoplawski reviewed Aug 24, 2022

View reviewed changes

vijaydhanraj commented Aug 24, 2022

View reviewed changes

boryspoplawski requested changes Aug 24, 2022

View reviewed changes

vijaydhanraj commented Aug 25, 2022

View reviewed changes

dimakuv previously approved these changes Aug 25, 2022

View reviewed changes

boryspoplawski previously approved these changes Aug 25, 2022

View reviewed changes

vijaydhanraj dismissed stale reviews from boryspoplawski and dimakuv via 405e2ce August 25, 2022 18:41

boryspoplawski previously approved these changes Aug 25, 2022

View reviewed changes

dimakuv previously approved these changes Aug 26, 2022

View reviewed changes

mkow requested changes Sep 6, 2022

View reviewed changes

vijaydhanraj dismissed stale reviews from dimakuv and boryspoplawski via dd713c4 September 6, 2022 14:09

vijaydhanraj commented Sep 6, 2022

View reviewed changes

mkow previously approved these changes Sep 6, 2022

View reviewed changes

boryspoplawski previously approved these changes Sep 6, 2022

View reviewed changes

dimakuv previously approved these changes Sep 6, 2022

View reviewed changes

boryspoplawski dismissed stale reviews from dimakuv, mkow, and themself via c28409c September 6, 2022 17:02

boryspoplawski force-pushed the vdhanraj/add_getcpu_sycall_support branch from dd713c4 to c28409c Compare September 6, 2022 17:02

boryspoplawski approved these changes Sep 6, 2022

View reviewed changes

dimakuv approved these changes Sep 6, 2022

View reviewed changes

boryspoplawski merged commit c28409c into gramineproject:master Sep 6, 2022

[LibOS] Add support for getcpu syscall #609

[LibOS] Add support for getcpu syscall #609

Conversation

vijaydhanraj commented May 26, 2022 • edited by dimakuv Loading

Description of the changes

How to test this PR?

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv commented Aug 2, 2022

vijaydhanraj commented Aug 2, 2022

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

boryspoplawski left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

boryspoplawski left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

boryspoplawski left a comment

Choose a reason for hiding this comment

boryspoplawski commented Aug 25, 2022

boryspoplawski left a comment

Choose a reason for hiding this comment

boryspoplawski commented Aug 25, 2022

[LibOS] Add support for `getcpu` syscall #609

[LibOS] Add support for `getcpu` syscall #609

vijaydhanraj commented May 26, 2022 •

edited by dimakuv

Loading