summaryrefslogtreecommitdiff
path: root/src/ldso/arm/tlsdesc.S
AgeCommit message (Collapse)AuthorLines
2019-09-29remove remaining traces of __tls_get_newSzabolcs Nagy-2/+0
Some declarations of __tls_get_new were left in the code, even though the definition got removed in commit 9d44b6460ab603487dab4d916342d9ba4467e6b9 install dynamic tls synchronously at dlopen, streamline access this can make the build fail with ld: lib/libc.so: hidden symbol `__tls_get_new' isn't defined when libc.so is linked without --gc-sections, because a .hidden declaration in asm code creates a reference even if the symbol is not actually used.
2019-09-11fix arm __tlsdesc_dynamic when built as thumb code without __ARM_ARCH>=5Rich Felker-0/+4
we don't actually support building asm source files as thumb1, but it's possible that the condition __ARM_ARCH>=5 would be false on old compilers that did not define __ARM_ARCH at all. avoiding that would require enumerating all of the possible __ARM_ARCH_*__ macros for testing. as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc is not valid for saving a return address when in thumb mode. since this code is a hot path (dynamic TLS access), don't do the out-of-line bl->bx chaining to save the return value; instead, use the fact that this file is preprocessed asm to add the missing thumb bit with an add in place of the mov. the change here does not affect builds for ISA levels new enough to have a thread pointer read instruction, or for armv5 and later as long as the compiler properly defines __ARM_ARCH, or for any build as arm (not thumb) code. it's likely that it makes no difference whatsoever to any present-day practical build environments, but nonetheless now it's safe. as an alternative, we could just assume __thumb__ implies availability of blx since we don't support building asm source files as thumb1. I didn't do that in order to avoid having a wrong assumption here if that ever changes.
2019-02-18install dynamic tls synchronously at dlopen, streamline accessRich Felker-19/+0
previously, dynamic loading of new libraries with thread-local storage allocated the storage needed for all existing threads at load-time, precluding late failure that can't be handled, but left installation in existing threads to take place lazily on first access. this imposed an additional memory access and branch on every dynamic tls access, and imposed a requirement, which was not actually met, that the dynamic tlsdesc asm functions preserve all call-clobbered registers before calling C code to to install new dynamic tls on first access. the x86[_64] versions of this code wrongly omitted saving and restoring of fpu/vector registers, assuming the compiler would not generate anything using them in the called C code. the arm and aarch64 versions saved known existing registers, but failed to be future-proof against expansion of the register file. now that we track live threads in a list, it's possible to install the new dynamic tls for each thread at dlopen time. for the most part, synchronization is not needed, because if a thread has not synchronized with completion of the dlopen, there is no way it can meaningfully request access to a slot past the end of the old dtv, which remains valid for accessing slots which already existed. however, it is necessary to ensure that, if a thread sees its new dtv pointer, it sees correct pointers in each of the slots that existed prior to the dlopen. my understanding is that, on most real-world coherency architectures including all the ones we presently support, a built-in consume order guarantees this; however, don't rely on that. instead, the SYS_membarrier syscall is used to ensure that all threads see the stores to the slots of their new dtv prior to the installation of the new dtv. if it is not supported, the same is implemented in userspace via signals, using the same mechanism as __synccall. the __tls_get_addr function, variants, and dynamic tlsdesc asm functions are all updated to remove the fallback paths for claiming new dynamic tls, and are now all branch-free.
2018-10-09fix build regression on armhf in tlsdesc asmRich Felker-0/+1
when invoking the assembler, arm gcc does not always pass the right flags to enable use of vfp instruction mnemonics. for C code it produces, it emits the .fpu directive, but this does not help when building asm source files, which tlsdesc needs to be. to fix, use an explicit directive here. commit 0beb9dfbecad38af9759b1e83eeb007e28b70abb introduced this regression. it has not appeared in any release.
2018-10-01inline cp15 thread pointer load in arm dynamic TLSDESC asm when possibleRich Felker-0/+9
the indirect function call is a significant portion of the code path for the dynamic case, and most users are probably building for ISA levels where it can be omitted. we could drop at least one register save/restore (lr) with this change, and possibly another (ip) with some clever shuffling, but it's not clear whether there's a way to do it that's not more expensive, or whether avoiding the save/restore would have any practical effect, so in the interest of avoiding complexity it's omitted for now.
2018-10-01add TLSDESC support for 32-bit armRich Felker-0/+62
unlike other asm where the baseline ISA is used, these functions are hot paths and use ISA-level specializations. call-clobbered vfp registers are saved before calling __tls_get_new, since there is no guarantee it won't use them. while setjmp/longjmp have to use hwcap to decide whether to the fpu is in use, since application code could be using vfp registers even if libc was compiled as pure softfloat, __tls_get_new is part of libc and can be assumed not to have access to vfp registers if tlsdesc.S does not. thus it suffices just to check the predefined preprocessor macros. the check for __ARM_PCS_VFP is redundant; !__SOFTFP__ must always be true if the target ISA level includes fpu instructions/registers.