-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Pal/Linux-SGX] Pre-allocate more PAL memory on huge manifest files #26
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
Pal/src/host/Linux-SGX/db_main.c, line 686 at r1 (raw file):
* * FIXME: this is a quick hack, we need proper memory allocation in PAL. */ if (manifest_size > 10 * 1024 * 1024) {
log_warning
about that? "This is a huge manifest, preallocating 128MB of internal memory"
Pal/src/host/Linux-SGX/db_main.c, line 723 at r1 (raw file):
if (pal_internal_mem_size < g_pal_internal_mem_size) { log_error("Too small `loader.pal_internal_mem_size`, need at least %lu bytes",
This will be confusing for the user. How about something like "Too small loader.pal_internal_mem_size
, need at least %lu bytes because the manifest is large"?
Also, maybe add %lu bytes (%luMB)
since the user is likely to use megabytes in the manifest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
Pal/src/host/Linux-SGX/db_main.c, line 686 at r1 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
log_warning
about that? "This is a huge manifest, preallocating 128MB of internal memory"
Done. Used log_always
because log_warning
will not be output (default log level is ERROR, and we didn't yet read the manifest options), and because log_error
doesn't fit here logically.
Pal/src/host/Linux-SGX/db_main.c, line 723 at r1 (raw file):
Previously, pwmarcz (Paweł Marczewski) wrote…
This will be confusing for the user. How about something like "Too small
loader.pal_internal_mem_size
, need at least %lu bytes because the manifest is large"?Also, maybe add
%lu bytes (%luMB)
since the user is likely to use megabytes in the manifest?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @pwmarcz)
a discussion (no related file):
FYI: I accidentally commited my local changes to bash/manifest.template
in one of the previous fixup commits. Removed it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @pwmarcz)
a discussion (no related file):
We went ahead and tested the PR where like Dmitrii mentioned it prompts the user to add the loader.pal_internal_mem_size
relevant for the workload in the manifest file and the test passes post that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I went through your testing procedure manually and it works for me.
Reviewed 1 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I request a review from @boryspoplawski or @mkow ? I wonder if they maybe come up with something better than this hack...
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @jinengandhi-intel)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
We went ahead and tested the PR where like Dmitrii mentioned it prompts the user to add the
loader.pal_internal_mem_size
relevant for the workload in the manifest file and the test passes post that.
Thanks for this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you meant 18MB
not GB
:)
Reviewed 1 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @jinengandhi-intel)
Pal/src/host/Linux-SGX/db_main.c, line 724 at r3 (raw file):
} if (pal_internal_mem_size < g_pal_internal_mem_size) {
So if we do not specify loader.pal_internal_mem_size
(default is 0
) and the manifest is big, this will trigger and exit. Is that intended?
Pal/src/host/Linux-SGX/db_main.c, line 729 at r3 (raw file):
ocall_exit(1, true); } g_pal_internal_mem_size = pal_internal_mem_size;
How does it work? In normal/default case this will be 0
(just like in the old version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look that bad. A bit hacky, but I don't see a better way, without major rework of manifest handling.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @jinengandhi-intel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the description in the PR. Sure it is MB
.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @jinengandhi-intel)
Pal/src/host/Linux-SGX/db_main.c, line 724 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
So if we do not specify
loader.pal_internal_mem_size
(default is0
) and the manifest is big, this will trigger and exit. Is that intended?
This will not trigger. Because loader.pal_internal_mem_size == g_pal_internal_mem_size == 0
, and this IF clause is a strict <
Pal/src/host/Linux-SGX/db_main.c, line 729 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
How does it work? In normal/default case this will be
0
(just like in the old version).
That was exactly my intention -- for normal cases (not-too-big manifest), the behavior is like it was before.
Is your question -- how do we have any PAL internal memory if g_pal_internal_mem_size == 0
? This is because this g_pal_internal_mem_size
variable denotes the dynamically allocated PAL memory. And we always have the additional *statically allocated 64MB of memory in the form of a BSS global variable g_mem_pool
.
So the default behavior (when loader.pal_internal_mem_size == 0
) is to only use 64MB of g_mem_pool
. This is even described in our ReadTheDocs manifest-syntax page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @jinengandhi-intel)
Pal/src/host/Linux-SGX/db_main.c, line 724 at r3 (raw file):
and the manifest is big
By this I meant g_pal_internal_mem_size > 0
(and default loader.pal_internal_mem_size == 0
)
Pal/src/host/Linux-SGX/db_main.c, line 729 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
That was exactly my intention -- for normal cases (not-too-big manifest), the behavior is like it was before.
Is your question -- how do we have any PAL internal memory if
g_pal_internal_mem_size == 0
? This is because thisg_pal_internal_mem_size
variable denotes the dynamically allocated PAL memory. And we always have the additional *statically allocated 64MB of memory in the form of a BSS global variableg_mem_pool
.So the default behavior (when
loader.pal_internal_mem_size == 0
) is to only use 64MB ofg_mem_pool
. This is even described in our ReadTheDocs manifest-syntax page.
Thanks for explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), "fixup! " found in commit messages' one-liners (waiting on @boryspoplawski and @jinengandhi-intel)
Pal/src/host/Linux-SGX/db_main.c, line 724 at r3 (raw file):
Previously, boryspoplawski (Borys Popławski) wrote…
and the manifest is big
By this I meant
g_pal_internal_mem_size > 0
(and defaultloader.pal_internal_mem_size == 0
)
Ah, sorry, misunderstood the first time.
Yes, this is intended -- your scenario would mean that the user fed a big manifest to Gramine, and Gramine decided to allocate 64/128 MB of additional PAL memory. Since this "allocation of additional PAL memory" is controlled by the loader.pal_internal_mem_size
manifest option, Gramine cannot just silently continue working. Otherwise what does it mean? The user will think that Gramine is not using additional PAL memory, and if the user will decide later at some point to set loader.pal_internal_mem_size = 32M
, how should Gramine react?
To remove this ambiguity, I added this check and I force the user to specify loader.pal_internal_mem_size
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @jinengandhi-intel)
Pal/src/host/Linux-SGX/db_main.c, line 724 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ah, sorry, misunderstood the first time.
Yes, this is intended -- your scenario would mean that the user fed a big manifest to Gramine, and Gramine decided to allocate 64/128 MB of additional PAL memory. Since this "allocation of additional PAL memory" is controlled by the
loader.pal_internal_mem_size
manifest option, Gramine cannot just silently continue working. Otherwise what does it mean? The user will think that Gramine is not using additional PAL memory, and if the user will decide later at some point to setloader.pal_internal_mem_size = 32M
, how should Gramine react?To remove this ambiguity, I added this check and I force the user to specify
loader.pal_internal_mem_size
explicitly.
Sounds good
The tomlc99 library to parse TOML files uses a lot of memory during `toml_parse()`. Sometimes apps have very large manifest files (>10MB), and parsing these manifest files requires more than 128MB of preallocated memory pool in Linux-SGX PAL. Since this function is called before PAL can read `loader.pal_internal_mem_size`, there is no way to learn the additional memory size for PAL from the manifest. Thus we use a heuristic to pre-allocate some PAL memory before `toml_parse()`. Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
f52e492
fbbd9af
to
f52e492
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jinengandhi-intel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jinengandhi-intel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r2, 1 of 1 files at r4, all commit messages.
Dismissed @jinengandhi-intel from a discussion.
Reviewable status: complete! all files reviewed, all discussions resolved
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Thanks for this!
@jinengandhi-intel: You put a blocking comment, but from its contents it seems that wasn't intended, so I'm dismissing it.
Description of the changes
The tomlc99 library to parse TOML files uses a lot of memory during
toml_parse()
. Sometimes apps have very large manifest files (>10MB), and parsing these manifest files requires more than 64MB of preallocated memory pool in Linux-SGX PAL.Since this function is called before PAL can read
loader.pal_internal_mem_size
, there is no way to learn the additional memory size for PAL from the manifest. Thus we use a heuristic to pre-allocate some PAL memory beforetoml_parse()
.I understand that this fix is... dubious, but couldn't come up with something better.
Fixes gramineproject/graphene#2680
How to test this PR?
Try with huge manifest files. E.g., I did this in our Bash example:
This resulted in an 18MB
bash.manifest.sgx
file. Without this PR, Graphene fails attoml_parse()
step:(Increasing
loader.pal_internal_mem_size
doesn't help becausetoml_parse()
fails before being able to read this manifest option.)With this PR and without
loader.pal_internal_mem_size = "128M"
, Graphene correctly complains:Finally, with this PR and with
loader.pal_internal_mem_size = "128M"
, Graphene works:This change is