Skip to content

Wasm: cache Wasm OCI image permission check results#5358

Merged
arkodg merged 8 commits intoenvoyproxy:mainfrom
zhaohuabing:fix-5326
Mar 7, 2025
Merged

Wasm: cache Wasm OCI image permission check results#5358
arkodg merged 8 commits intoenvoyproxy:mainfrom
zhaohuabing:fix-5326

Conversation

@zhaohuabing
Copy link
Member

@zhaohuabing zhaohuabing commented Feb 26, 2025

This PR introduce a caching mechanism for the Wasm OCI image permission check results.

Background:
Previously, to prevent unauthorized EEPs from accessing a private Wasm OCI image, EG would verify the pullSecrect against the OCI registry in the Gateway API translator, even if the OCI image has already been pulled and cached locally. Because this verification involved network access, it could block the translator, especially when a lots of endpoints are scaling up/down.

What's changed:
This PR adds a permission cache in the Wasm package. A backgroud goroutine perodically checks the EEP's pullSecret against OCI registry and cache the check result. The Gateway API translator can now use the cached permission check result to tell if an EEP can access the OCI image.

  • Initial Image Pull: The first attempt to pull an OCI image will still block because neither the file cache nor the permission cache has any entry yet.
  • Subsequent Requests: After the initial pull, subsequent requests for the same image will not block because the permission check is served from the cache.

Cache expiry:

  • The cached permissions are rechecked every hour in case the registry invalidates an existing pullSecret.
  • If a chaced entry is not accessed in 24 hours, it will be removed from the cache.

These default values are now fixed consts, they can be exposed to EG configuration later if needed.

fix: #5326

@zhaohuabing zhaohuabing requested a review from a team as a code owner February 26, 2025 11:44
@zhaohuabing zhaohuabing marked this pull request as draft February 26, 2025 11:44
@codecov
Copy link

codecov bot commented Feb 26, 2025

Codecov Report

Attention: Patch coverage is 78.37838% with 32 lines in your changes missing coverage. Please review.

Project coverage is 65.17%. Comparing base (1c0eca6) to head (f935443).
Report is 27 commits behind head on main.

Files with missing lines Patch % Lines
internal/wasm/cache.go 38.46% 14 Missing and 2 partials ⚠️
internal/wasm/premissioncache.go 86.88% 14 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5358      +/-   ##
==========================================
+ Coverage   65.07%   65.17%   +0.10%     
==========================================
  Files         213      214       +1     
  Lines       33588    33723     +135     
==========================================
+ Hits        21857    21979     +122     
- Misses      10402    10415      +13     
  Partials     1329     1329              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zhaohuabing zhaohuabing force-pushed the fix-5326 branch 3 times, most recently from 761fe56 to d6e47b2 Compare March 1, 2025 02:58
@zhaohuabing zhaohuabing changed the title add TTL for wasm permission check fix: cache Wasm OCI image permission check result Mar 1, 2025
@zhaohuabing zhaohuabing changed the title fix: cache Wasm OCI image permission check result Wasm: cache Wasm OCI image permission check result Mar 1, 2025
@zhaohuabing zhaohuabing force-pushed the fix-5326 branch 4 times, most recently from dc3e23c to baba11d Compare March 1, 2025 05:37
@zhaohuabing zhaohuabing marked this pull request as ready for review March 1, 2025 05:47
@zhaohuabing zhaohuabing changed the title Wasm: cache Wasm OCI image permission check result Wasm: cache Wasm OCI image permission check results Mar 1, 2025
@zhaohuabing zhaohuabing requested review from a team and arkodg March 1, 2025 06:01
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
@zhaohuabing zhaohuabing force-pushed the fix-5326 branch 2 times, most recently from 8386fc8 to 17702ef Compare March 1, 2025 11:39
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Insecure: insecure,
PullSecret: pullSecret,
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the drawback of this, is, if permissions change on the server side, I'll have to wait 1 hr to update my wasm image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the primary issue here is - fetching a WASM Image including checking permissions blocks the route linked to this EEP as well as other routes and endpoints and other xds from being pushed to the data plane. Ideally we'd like this two tasks to be happening in parallel.
Can we fetch and check permissions in a separate go routine to unblock the translator ? If the wasm fetcher / checker go routine finds errors it could flag it in the status using watchable
Hoping the WASM URL is based off policy ns-name so data plane requests will fail in case of a deny

Copy link
Member Author

@zhaohuabing zhaohuabing Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we fetch and check permissions in a separate go routine to unblock the translator ? If the wasm fetcher / checker go routine finds errors it could flag it in the status using watchable
Hoping the WASM URL is based off policy ns-name so data plane requests will fail in case of a deny

We can't fetch the OCI image async. The sha256 checksum of the wasm module is a mandatory field for xDS Wasm remote code source, and we can't get the checksum without fetching the image.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the drawback of this, is, if permissions change on the server side, I'll have to wait 1 hr to update my wasm image

Permission change actually won't invalidate cached wasm image, it only prevents the requests from unauthorized EEPs, unauthorized EEP translation will fail.

The cached images are purged after not being touched for expiry duration, which is handled in the wasm image chace.

Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks !

@arkodg arkodg requested review from a team March 7, 2025 01:34
@arkodg arkodg merged commit 672de8a into envoyproxy:main Mar 7, 2025
25 checks passed
guydc pushed a commit to guydc/gateway that referenced this pull request Mar 21, 2025
* add TTL for wasm permission check

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix test

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* change

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refresh the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* purge the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refactor

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* on retry on retriable errors

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 672de8a)
Signed-off-by: Guy Daich <guy.daich@sap.com>
zhaohuabing added a commit to zhaohuabing/gateway that referenced this pull request Mar 22, 2025
* add TTL for wasm permission check

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix test

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* change

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refresh the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* purge the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refactor

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* on retry on retriable errors

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 672de8a)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
guydc added a commit that referenced this pull request Mar 24, 2025
* load BackendTLSPolicy in standalone mode (#5431)

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
(cherry picked from commit 4d914ae)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* Wasm: cache Wasm OCI image permission check results (#5358)

* add TTL for wasm permission check

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix test

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* change

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refresh the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* purge the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refactor

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* on retry on retriable errors

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 672de8a)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* Load EnvoyExtensionPolicy in standalone mode (#5460)

* load EnvoyExtensionPolicy in standalone mode

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* more

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* release note

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* review: use a valid target name instead of myapp

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* gen

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
(cherry picked from commit 4be098d)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: check for mirror backendRef in httproute index (#5497)

* check for mirror backendRef

Signed-off-by: mark winter <mark.winter@thetradedesk.com>
(cherry picked from commit 72b72c4)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: dont return an err when gatewayclass is not accepted (#5524)

* bug: dont return an err when gatewayclass is not accepted

this is a user generated error, we shouldnt log it as
a system error, and return with an error

Signed-off-by: Arko Dasgupta <arko@tetrate.io>

* release notes

Signed-off-by: Arko Dasgupta <arko@tetrate.io>

---------

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
(cherry picked from commit 51e87ca)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: host header should not be allowed to modify (#5533)

* host header is not allowed to be modified

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* address comment

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 54efa34)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: retrigger reconciliation when backendRef of type ServiceImport is updated (#5461)

* fix: retrigger reconilation when backendRef of type ServiceImport is updated

Signed-off-by: Teju Nareddy <tejunareddy@gmail.com>
(cherry picked from commit e2f8978)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* pin envoy and ratelimit

Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: otel sink json access logging without text field (#5498)

* fix otel sink json access logging without text field

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

* use json format as default when format or type is not set

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

* set formatters only if the slice of formatters is not empty

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

---------

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>
Co-authored-by: Arko Dasgupta <arkodg@users.noreply.github.com>
(cherry picked from commit cb3ffd2)
Signed-off-by: Guy Daich <guy.daich@sap.com>

* [release/v1.3] v1.3.2 release notes (#5584)

v1.3.2 release notes

Signed-off-by: Guy Daich <guy.daich@sap.com>

---------

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Guy Daich <guy.daich@sap.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: mark winter <mark.winter@thetradedesk.com>
Signed-off-by: Teju Nareddy <tejunareddy@gmail.com>
Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>
Co-authored-by: Arko Dasgupta <arkodg@users.noreply.github.com>
Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Co-authored-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Co-authored-by: Mark Winter <wintermarkedward@gmail.com>
Co-authored-by: Teju Nareddy <tejunareddy@gmail.com>
Co-authored-by: Tomi Juntunen <tomi.juntunen@iki.fi>
arkodg added a commit that referenced this pull request Mar 25, 2025
* bump envoy to v1.32.4

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix: host header should not be allowed to modify (#5533)

* host header is not allowed to be modified

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* address comment

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 54efa34)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* bump ratelimit to 0141a24

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* Wasm: cache Wasm OCI image permission check results (#5358)

* add TTL for wasm permission check

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix test

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* change

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refresh the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* purge the cache

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* refactor

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* on retry on retriable errors

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
(cherry picked from commit 672de8a)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* load BackendTLSPolicy in standalone mode (#5431)

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
(cherry picked from commit 4d914ae)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix: check for mirror backendRef in httproute index (#5497)

* check for mirror backendRef

Signed-off-by: mark winter <mark.winter@thetradedesk.com>
(cherry picked from commit 72b72c4)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix: dont return an err when gatewayclass is not accepted (#5524)

* bug: dont return an err when gatewayclass is not accepted

this is a user generated error, we shouldnt log it as
a system error, and return with an error

Signed-off-by: Arko Dasgupta <arko@tetrate.io>

* release notes

Signed-off-by: Arko Dasgupta <arko@tetrate.io>

---------

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
(cherry picked from commit 51e87ca)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* update release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* update reatelimit

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix gen

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* Load EnvoyExtensionPolicy in standalone mode (#5460)

* load EnvoyExtensionPolicy in standalone mode

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* more

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* release note

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* review: use a valid target name instead of myapp

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

* gen

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
(cherry picked from commit 4be098d)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* add security update to release note

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* fix: otel sink json access logging without text field (#5498)

* fix otel sink json access logging without text field

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

* use json format as default when format or type is not set

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

* set formatters only if the slice of formatters is not empty

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>

---------

Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>
Co-authored-by: Arko Dasgupta <arkodg@users.noreply.github.com>
(cherry picked from commit cb3ffd2)
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* update release date

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

* update release date

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: mark winter <mark.winter@thetradedesk.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Tomi Juntunen <tomi.juntunen@iki.fi>
Co-authored-by: Arko Dasgupta <arkodg@users.noreply.github.com>
Co-authored-by: Mark Winter <wintermarkedward@gmail.com>
Co-authored-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Co-authored-by: Tomi Juntunen <tomi.juntunen@iki.fi>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Endpoints disvovery problem in kubernetes during rollout updates/restarts

4 participants