
Implement and use a once mechanism #8869

Open
tom-cosgrove-arm opened this issue Feb 28, 2024 · 2 comments

tom-cosgrove-arm (Contributor) commented Feb 28, 2024

There are some places in the code where statics are initialised to -1, and this is used as a sentinel value to trigger the determination of feature availability.

For example:

library/aes.c: static int aes_padlock_ace = -1;
library/aesce.c: signed char mbedtls_aesce_has_support_result = -1;
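
For illustration, these sentinels are typically consumed with an unsynchronized check-then-write along these lines (a simplified sketch, not the exact Mbed TLS source; detect_padlock() is a hypothetical stand-in for the real detection call):

    static int aes_padlock_ace = -1;

    int padlock_ace_available(void)
    {
        if (aes_padlock_ace == -1) {             /* unsynchronized read  */
            aes_padlock_ace = detect_padlock();  /* unsynchronized write */
        }
        return aes_padlock_ace;
    }

Two threads can race through the check and both run the detection; under the C memory model this is a data race even when it happens to work in practice.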

The standard thread-safe way to do this is with a once mechanism, e.g. pthread_once() or InitOnceExecuteOnce().

This issue is to create an Mbed TLS execute-once wrapper, and use that to set these globals.
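
A minimal sketch of what such a wrapper could look like (the names mbedtls_once, mbedtls_once_t and MBEDTLS_ONCE_INIT are placeholders for illustration, not an existing Mbed TLS API), assuming pthreads on POSIX platforms and InitOnceExecuteOnce() on Windows:

    #if defined(_WIN32)
    #include <windows.h>

    typedef INIT_ONCE mbedtls_once_t;
    #define MBEDTLS_ONCE_INIT INIT_ONCE_STATIC_INIT

    /* Adapt a plain void(void) initializer to the InitOnceExecuteOnce
     * callback signature. (Casting a PVOID to a function pointer is not
     * strictly portable ISO C, but is well-defined on Windows.) */
    static BOOL CALLBACK mbedtls_once_trampoline(PINIT_ONCE once,
                                                 PVOID param, PVOID *context)
    {
        (void) once;
        (void) context;
        ((void (*)(void)) param)();
        return TRUE;
    }

    static void mbedtls_once(mbedtls_once_t *once, void (*init)(void))
    {
        InitOnceExecuteOnce(once, mbedtls_once_trampoline, (PVOID) init, NULL);
    }
    #else
    #include <pthread.h>

    typedef pthread_once_t mbedtls_once_t;
    #define MBEDTLS_ONCE_INIT PTHREAD_ONCE_INIT

    static void mbedtls_once(mbedtls_once_t *once, void (*init)(void))
    {
        (void) pthread_once(once, init);
    }
    #endif

A caller such as aesce.c could then declare, say, static mbedtls_once_t aesce_detect_once = MBEDTLS_ONCE_INIT; and run mbedtls_once(&aesce_detect_once, aesce_detect_features); where a hypothetical aesce_detect_features() sets mbedtls_aesce_has_support_result.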

solardiz (Contributor) commented Dec 11, 2024

I notice there's a related thread-safety (non-?)issue in mbedtls_aesni_has_support(). Edit: moved it to #9840.

solardiz (Contributor) commented Dec 11, 2024

The experiment below speeds things up a tiny bit by skipping a call to a function provided by aesni.c. (The added inline keyword is needed because, for me, these changes otherwise caused this patched function to no longer be inlined, which negated the speedup from the avoided call to the other function.)

--- a/src/mbedtls/aes.c
+++ b/src/mbedtls/aes.c
@@ -532,10 +532,15 @@ void mbedtls_aes_xts_free(mbedtls_aes_xts_context *ctx)
 #define MAY_NEED_TO_ALIGN
 #endif
 
-MBEDTLS_MAYBE_UNUSED static unsigned mbedtls_aes_rk_offset(uint32_t *buf)
+MBEDTLS_MAYBE_UNUSED static inline unsigned mbedtls_aes_rk_offset(uint32_t *buf)
 {
 #if defined(MAY_NEED_TO_ALIGN)
-    int align_16_bytes = 0;
+    static int align_16_bytes = 0;
+
+    if (align_16_bytes > 0)
+        goto align_16_bytes;
+    if (align_16_bytes < 0)
+        return 0;
 
 #if defined(MBEDTLS_VIA_PADLOCK_HAVE_CODE)
     if (aes_padlock_ace == -1) {
@@ -553,6 +558,7 @@ MBEDTLS_MAYBE_UNUSED static unsigned mbedtls_aes_rk_offset(uint32_t *buf)
 #endif
 
     if (align_16_bytes) {
+align_16_bytes: ;
         /* These implementations needs 16-byte alignment
          * for the round key array. */
         unsigned delta = ((uintptr_t) buf & 0x0000000fU) / 4;
@@ -561,6 +567,8 @@ MBEDTLS_MAYBE_UNUSED static unsigned mbedtls_aes_rk_offset(uint32_t *buf)
         } else {
             return 4 - delta; // 16 bytes = 4 uint32_t
         }
+    } else {
+        align_16_bytes = -1;
     }
 #else /* MAY_NEED_TO_ALIGN */
     (void) buf;

Just sharing this to illustrate another place where "use once" would be useful, but only if it's just as fast as the above, which is unlikely when doing things properly?

I think the above dirty hack is fine across architectures, since it only assumes that reads and writes of an int are atomic; it makes no assumption about when the writes become visible to reads in other threads.
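
For comparison, roughly the same fast path can be made race-free where C11 atomics are available (a sketch assuming <stdatomic.h>, which Mbed TLS cannot rely on unconditionally; detect_needs_alignment() is a hypothetical stand-in for the detection logic). The detection may still run more than once under contention, but since it is idempotent and only a single int is published, relaxed ordering suffices:

    #include <stdatomic.h>

    static _Atomic int align_16_bytes = 0;   /* 0 = unknown, 1 = yes, -1 = no */

    static int needs_16_byte_alignment(void)
    {
        int state = atomic_load_explicit(&align_16_bytes, memory_order_relaxed);
        if (state == 0) {
            state = detect_needs_alignment() ? 1 : -1;
            atomic_store_explicit(&align_16_bytes, state, memory_order_relaxed);
        }
        return state > 0;
    }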
