summaryrefslogtreecommitdiff
path: root/src/thread/i386/tls.s
AgeCommit message (Collapse)AuthorLines
2019-02-18install dynamic tls synchronously at dlopen, streamline accessRich Felker-8/+0
previously, dynamic loading of new libraries with thread-local storage allocated the storage needed for all existing threads at load-time, precluding late failure that can't be handled, but left installation in existing threads to take place lazily on first access. this imposed an additional memory access and branch on every dynamic tls access, and imposed a requirement, which was not actually met, that the dynamic tlsdesc asm functions preserve all call-clobbered registers before calling C code to to install new dynamic tls on first access. the x86[_64] versions of this code wrongly omitted saving and restoring of fpu/vector registers, assuming the compiler would not generate anything using them in the called C code. the arm and aarch64 versions saved known existing registers, but failed to be future-proof against expansion of the register file. now that we track live threads in a list, it's possible to install the new dynamic tls for each thread at dlopen time. for the most part, synchronization is not needed, because if a thread has not synchronized with completion of the dlopen, there is no way it can meaningfully request access to a slot past the end of the old dtv, which remains valid for accessing slots which already existed. however, it is necessary to ensure that, if a thread sees its new dtv pointer, it sees correct pointers in each of the slots that existed prior to the dlopen. my understanding is that, on most real-world coherency architectures including all the ones we presently support, a built-in consume order guarantees this; however, don't rely on that. instead, the SYS_membarrier syscall is used to ensure that all threads see the stores to the slots of their new dtv prior to the installation of the new dtv. if it is not supported, the same is implemented in userspace via signals, using the same mechanism as __synccall. the __tls_get_addr function, variants, and dynamic tlsdesc asm functions are all updated to remove the fallback paths for claiming new dynamic tls, and are now all branch-free.
2015-04-14use hidden __tls_get_new for tls/tlsdesc lookup fallback casesRich Felker-1/+3
previously, the dynamic tlsdesc lookup functions and the i386 special-ABI ___tls_get_addr (3 underscores) function called __tls_get_addr when the slot they wanted was not already setup; __tls_get_addr would then in turn also see that it's not setup and call __tls_get_new. calling __tls_get_new directly is both more efficient and avoids the issue of calling a non-hidden (public API/ABI) function from asm. for the special i386 function, a weak reference to __tls_get_new is used since this function is not defined when static linking (the code path that needs it is unreachable in static-linked programs).
2014-06-19optimize i386 ___tls_get_addr asmRich Felker-1/+8
2012-10-04beginnings of full TLS support in shared librariesRich Felker-0/+8
this code will not work yet because the necessary relocations are not supported, and cannot be supported without some internal changes to how relocation processing works (coming soon).