Known issues for production deployment #7

Open
13 of 14 tasks
mkow opened this issue May 29, 2020 · 10 comments
Labels
documentation (Improvements or additions to documentation), P: 2

Comments

@mkow
Member

mkow commented May 29, 2020

This issue lists items that need to be kept in mind as you consider using Graphene in a production deployment scenario.

Issues: (checked means "already fixed on master")

  • Fix all known security issues
    More information: Known security issues #8.
  • Documenting possible misuses of Graphene and its limitations
    Graphene has some limitations (some depend on the backend, e.g. under SGX you can't get trusted time) and users should be aware of them.
  • Rewriting old, buggy and unstable subsystems:
  • Support for upstreamed SGX driver for Linux
    Upstreaming is still in progress, we're blocked on this. Update: SGX support made its way into Linux 5.11 and Graphene supports it.
  • Removal of Graphene SGX driver (done in [Pal/Linux-SGX] Get rid of sgx driver submodule graphene#1997)
    This driver is insecure and dangerous (see its README) and is only a temporary solution. We will drop it once FSGSBASE patches are upstreamed (that's the only functionality currently left in the driver).
  • Logging system + consistent output format
    Currently all subsystems output logs in a totally random fashion. We also need a better way to control the log level.
  • Splitting Graphene output from app output, same for error codes (partially done)
    Currently those are mixed, which makes the output not really useful in a production setup.
    Update: Graphene logs can now be redirected to a separate file in the manifest. App stdout and stderr are currently printed to the same host fd and the error codes are still "ANDed", but that is probably not a blocker for production deployments.
  • Protected filesystem
    First version almost done, see [Linux-SGX] Add protected files implementation (SGX SDK file format) graphene#1325. Required for most production use-cases.
  • Protected argv and env
    Using argv and environment from the untrusted world may easily lead to TEE compromise. See Provide a way to pass verified argv and environments graphene#508.
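
As a rough illustration of the concern in the last item, here is a minimal sketch of allowlisting environment variables that arrive from an untrusted host. This is not Graphene's actual mechanism; `ALLOWED_ENV` and `sanitize_env` are hypothetical names chosen for the example.

```python
# Hypothetical allowlist: only these variable names are accepted
# from the untrusted host; everything else is dropped.
ALLOWED_ENV = {"LANG", "TZ"}


def sanitize_env(untrusted_env):
    """Return a new dict containing only allowlisted variables."""
    return {
        name: value
        for name, value in untrusted_env.items()
        if name in ALLOWED_ENV
    }


if __name__ == "__main__":
    # A host-controlled environment that tries to inject a preload library.
    host_env = {"LANG": "C", "LD_PRELOAD": "/tmp/evil.so"}
    print(sanitize_env(host_env))
```

An allowlist is used rather than a denylist because the set of dangerous variables is open-ended, while the set an application actually needs is usually small.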
@woju woju pinned this issue May 29, 2020
@debin-yang

debin-yang commented Sep 28, 2020

Support basic file locking (see Basic file locking function support which is required to enable Spark with Graphene #437).

@mkow
Member Author

mkow commented Sep 28, 2020

@debin-yang This issue is only to aggregate general issues with Graphene which block it from being used in production for all purposes, not just specific workloads.

@yamahata yamahata unpinned this issue Nov 10, 2020
@yamahata yamahata pinned this issue Nov 10, 2020
@dimakuv
Contributor

dimakuv commented Mar 1, 2021

ELF parsing/loading

Isn't this done already? @pwmarcz @mkow

Support for upstreamed SGX driver for Linux

This is done I think. See gramineproject/graphene#2084 (for Graphene proper) and gramineproject/graphene#2165 (for GSC).

@pwmarcz
Contributor

pwmarcz commented Mar 1, 2021

ELF parsing/loading

Here's where we are:

  • LibOS: remove dynamic linking (this simplifies LibOS rtld code greatly, and fixes some bugs)
  • LibOS: refactor or rewrite remaining rtld code (at least get rid of gotos between loops)
  • PAL: either remove dynamic linking (pre-link PAL and LibOS before running), or rewrite it (based on musl)

The problems in PAL code are perhaps less harmful, because it's used only for loading PAL and LibOS binaries. However, I recall running into issues at least once (the relocation code crashing on CFI directives for hardcoded return address).

Somewhat related: fix linking of Graphene binaries to enable use of normal inline and LTO (see gramineproject/graphene#2179).

EDIT: The second checkbox is also done now.

EDIT: @dimakuv rewrote PAL dynamic linking, so we're done here.

@mkow
Member Author

mkow commented Mar 1, 2021

This is done I think. See gramineproject/graphene#2084 (for Graphene proper) and gramineproject/graphene#2165 (for GSC).

Marked as done.

@mkow mkow changed the title from "Production blockers" to "Known issues for production deployment" Jul 5, 2021
@mkow mkow transferred this issue from gramineproject/graphene Sep 9, 2021
@mkow mkow pinned this issue Sep 9, 2021
@dimakuv
Contributor

dimakuv commented Nov 25, 2021

@mkow Do we still want to keep this meta-issue open? There is one item left ("Documenting possible misuses of Graphene and its limitations"), and we don't have immediate plans to write a document like this.

@pwmarcz Could you mark your todo item ("rewrite db_rtld in PAL") as solved, after I submitted my PRs on this?

@dimakuv dimakuv added the documentation (Improvements or additions to documentation) and P: 2 labels Nov 25, 2021
@pwmarcz
Contributor

pwmarcz commented Nov 25, 2021

Done.

@mkow
Member Author

mkow commented Nov 25, 2021

@dimakuv: I'd keep it and in the meantime try to write up at least a short "secure deployment guidelines" doc, with all the dangers we are aware of clearly listed.

sammyne pushed a commit to sammyne/gramine that referenced this issue Nov 21, 2022
@dimakuv
Contributor

dimakuv commented Mar 9, 2023

Looks like the only thing left is this: Documenting possible misuses of Graphene and its limitations

@mkow Can we consider #1194 as fixing it? If yes, then I can add "Fixes 7" to my PR, and we'll automatically close this issue.

@mkow
Member Author

mkow commented Mar 13, 2023

Nope, this one is completely different. Your document describes the current Gramine state and limitations from the compatibility point of view; the one here is about security. Although reading it now, I think I should have described it better...
Anyways, I have a draft prepared already, need to finish it up finally.

@monavij monavij unpinned this issue May 30, 2023

4 participants