[sysvabi64] Add chapter on Thread Local Storage #311
The thread local storage chapter contains:
* A description of Thread Local Storage based on addenda32
* The key design decisions of AArch64 TLS such as tls variant, tls dialect, TCB size.
* The ABI required code sequence for TLSDESC that must be emitted exactly, as GNU ld requires it to be.
* Sequences for the different code-models.
* Relaxations for GD->IE, GD->LE and IE->LE.
* Synchronization requirements for Lazy TLSDESC. With advice not to support it due to overhead of synchronization.
sysvabi64/sysvabi64.rst
Outdated
and ``PT_TLS`` as the program header with type PT_TLS. ``PAD`` must be
the smallest positive integer that satisfies the following congruence:

``TP + TCB + PAD ≡ PT_TLS.p_vaddr (modulo PT_TLS.p_align)``
TP+TCB+PAD on the left could be confusing, as TCB is placed before TP. Perhaps mention the requirement of TP first (= 0 (modulo p_align)), then describe PAD and this formula.
I'll see if I can word it better. I've found it difficult to try and explain the formula intuitively.
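As one way to make the congruence concrete, here is a minimal C sketch of the padding calculation. It assumes TP is already aligned to p_align, that p_align is zero or a power of two as ELF requires, and it treats PAD as the smallest non-negative solution, so the result may be 0. The names `tls_pad` and `tls_block_offset` are hypothetical; they are not part of the ABI or of any dynamic linker.

```c
#include <stdint.h>

/* Hypothetical helper: smallest non-negative PAD such that
 *   TCB size + PAD ≡ p_vaddr (modulo p_align),
 * assuming TP ≡ 0 (modulo p_align). Unsigned wraparound is harmless
 * here because 2^64 is a multiple of any power-of-two p_align. */
static uint64_t tls_pad(uint64_t tcb_size, uint64_t p_vaddr, uint64_t p_align)
{
    if (p_align <= 1)
        return 0; /* no alignment constraint, so no padding is needed */
    return (p_vaddr - tcb_size) & (p_align - 1);
}

/* Hypothetical helper: offset from TP to the start of the TLS block. */
static uint64_t tls_block_offset(uint64_t tcb_size, uint64_t p_vaddr,
                                 uint64_t p_align)
{
    return tcb_size + tls_pad(tcb_size, p_vaddr, p_align);
}
```

For example, with a 16-byte TCB, p_vaddr = 0x10028 and p_align = 64, `tls_pad` gives 24, so the TLS block starts 40 bytes above TP, and 40 ≡ 0x10028 (modulo 64).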
sysvabi64/sysvabi64.rst
Outdated
Given that ``TP ≡ 0 (modulo PT_TLS.p_align)``. An expression
for `PAD` is ``PAD = (PT_TLS.p_vaddr - TCB) mod PT_TLS.p_align``.

A significant number of dynamic linkers use a different calculation
I believe that glibc Variant I handles p_vaddr!=0 (mod p_align) correctly. The bug (https://sourceware.org/bugzilla/show_bug.cgi?id=24606) is for Variant II (x86 etc).
I have fixed FreeBSD rtld's Variant II in https://reviews.freebsd.org/D31538. Its Variant I may or may not have the bug.
musl has been good since 1.1.23
Therefore, it's probably not "a significant number" but yeah p_vaddr=0 (mod p_align) is good for maximum compatibility
I found it difficult to be confident about the status of the various dynamic linkers. I can remove the significant part.
The glibc bug looks good for static TLS, but https://sourceware.org/bugzilla/show_bug.cgi?id=24606#c7 does mention that dynamic TLS still needs p_vaddr to be 0 (modulo p_align).
add xn, tp, :tprel_hi12:var, lsl #12 // R_AARCH64_TLSLE_ADD_TPREL_HI12 var
ldr xn, [xn, #:tprel_lo12_nc:var] // R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC var

Static link time TLS Relaxations
Perhaps call this Optimization to be consistent with x86/ppc and "Relocation optimization" (ADRP) and leave the term "relocation relaxation" for RISC-V style section shrinking.
For TLS specifically I'd prefer to keep relaxation, as that's what it's been referred to as in all the previous literature, such as Drepper's ELF Handling for Thread Local Storage and the TLSDESC paper too. It should help people searching in the references.
I take the point that it ought to have been called optimization. I'll add a sentence to say that we're using relaxation as a term from the existing literature.
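For readers following the local-exec sequence quoted in the diff above, the arithmetic behind the two relocations can be sketched in C. This is an illustration under stated assumptions, not linker code: it assumes the TP-relative offset fits in 24 bits and is 8-byte aligned (the 64-bit `ldr` form encodes bits [11:3] of the low part), and the function names are hypothetical.

```c
#include <stdint.h>

/* Bits [23:12] of the TP-relative offset: the immediate that
 * R_AARCH64_TLSLE_ADD_TPREL_HI12 places in the ADD, which the
 * instruction then shifts left by 12. */
static uint32_t tprel_hi12(uint64_t tprel)
{
    return (uint32_t)((tprel >> 12) & 0xfff);
}

/* Bits [11:0] of the TP-relative offset: the low part applied by
 * R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC, with no overflow check (_NC). */
static uint32_t tprel_lo12(uint64_t tprel)
{
    return (uint32_t)(tprel & 0xfff);
}

/* Models the address the add/ldr pair ends up accessing. */
static uint64_t tls_le_address(uint64_t tp, uint64_t tprel)
{
    uint64_t xn = tp + ((uint64_t)tprel_hi12(tprel) << 12); /* add xn, tp, #hi12, lsl #12 */
    return xn + tprel_lo12(tprel);                          /* ldr xn, [xn, #lo12] */
}
```

With tp = 0x1000 and an offset of 0x12348, the high part is 0x12, the low part is 0x348, and the modelled address is 0x13348, that is, tp plus the full offset.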
Undefined Weak Symbols

An undefined weak symbol has the value 0. As the resolver function
This might be the glibc behavior, but musl doesn't have the special __dl_tlsdesc_undefweak. I think it's better to allow flexibility and not require a particular behavior on undefined weak TLS.
This part is just an example of what can be done. I've written at the top of the section "The TLS resolver functions are not standardized by this ABI as they are internal to the dynamic linker" and "These examples are for illustrative purposes only".
I'll see if there's anything I can do to state that there is no requirement to implement a specific resolver function.
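To illustrate the kind of non-required behavior being discussed, here is a small C model of one possible approach. It assumes, as in the TLSDESC sequences, that the caller adds the resolver's return value to TP; a resolver for an undefined weak symbol could then return the negated TP so that the computed address is 0. Only the arithmetic is modelled: the names are hypothetical, and a real resolver would have to follow the TLSDESC calling convention, which this sketch does not attempt.

```c
#include <stdint.h>

/* Hypothetical stand-in for what a resolver might return for an
 * undefined weak TLS symbol: an offset that cancels TP. */
static uint64_t undefweak_offset(uint64_t tp)
{
    return (uint64_t)0 - tp;
}

/* Models the caller's side: address = TP + offset returned by the resolver. */
static uint64_t tls_address(uint64_t tp, uint64_t offset)
{
    return tp + offset;
}
```

Under this model, `tls_address(tp, undefweak_offset(tp))` is 0 for any `tp`, matching the statement that an undefined weak symbol has the value 0. A dynamic linker is free to handle the case differently.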
* Edits to split up the bullet points in How to denote TLS in source.
* Changed program-own state to process-state as the thread-id may not be stored separately from the program's data.
* Removed "typically" from some of the descriptions, as what it describes will almost always be the case for a sysvabi platform.
* Linked alignment padding to the definition.
* Provided a bit more information about generation counters.

* Rearranged formulas and used TCBsize to make it clearer.
* Taken out "significant" from a significant number of dynamic linkers.
* Give reason for using relaxation rather than optimization.
* Clarify that there is no requirement to implement any TLSDESC resolver given in the sysvabi.
Thanks very much for the review.
I've updated based on this and some comments I received internally.
The thread local storage chapter contains: