Age | Commit message (Collapse) | Author | Lines |
|
syscall numbers are now synced up across targets (starting from 403 the
numbers are the same on all targets other than an arch specific offset)
IPC syscalls sem*, shm*, msg* got added where they were missing (except
for semop: only semtimedop got added), the new semctl, shmctl, msgctl
imply IPC_64, see
linux commit 0d6040d4681735dfc47565de288525de405a5c99
arch: add split IPC system calls where needed
new 64bit time_t syscall variants got added on 32bit targets, see
linux commit 48166e6ea47d23984f0b481ca199250e1ce0730a
y2038: add 64-bit time_t syscalls to all 32-bit architectures
new async io syscalls got added, see
linux commit 2b188cc1bb857a9d4701ae59aa7768b5124e262e
Add io_uring IO interface
linux commit edafccee56ff31678a091ddb7219aba9b28bc3cb
io_uring: add support for pre-mapped user IO buffers
a new syscall got added that uses the fd of /proc/<pid> as a stable
handle for processes: allows sending signals without pid reuse issues,
intended to eventually replace rt_sigqueueinfo, kill, tgkill and
rt_tgsigqueueinfo, see
linux commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad
signal: add pidfd_send_signal() syscall
on some targets (arm, m68k, s390x, sh) some previously missing syscall
numbers got added as well.
|
|
it was never demonstrated to me that this workaround was needed, and
seems likely that, if there ever was any clang version for which it
was needed, it's old enough to be unusably buggy in other ways. if it
turns out some compilers actually can't do the register allocation
right, we'll need to replace this with inline shuffling code, since
the external __syscall dependency is being removed.
|
|
added in linux commit 4e21565b7fd4d9045765f697887e74a704135fe2
|
|
added in linux commit db7a2d1809a5b6b08d138ff68837f805fc073351
|
|
io_pgetevents is new in linux commit
7a074e96dee62586c935c80cecd931431bfdd0be
rseq is new in linux commit
d7822b1e24f2df5df98c76f0e94a5416349ff759
|
|
this will allow the compiler to cache and reuse the result, meaning we
no longer have to take care not to load it more than once for the sake
of archs where the load may be expensive.
depends on commit 1c84c99913bf1cd47b866ed31e665848a0da84a2 for
correctness, since otherwise the compiler could hoist loads during
stage 3 of dynamic linking before the initial thread-pointer setup.
|
|
In TLS variant I the TLS is above TP (or above a fixed offset from TP)
but on some targets there is a reserved gap above TP before TLS starts.
This matters for the local-exec tls access model when the offsets of
TLS variables from the TP are hard coded by the linker into the
executable, so the libc must compute these offsets the same way as the
linker. The tls offset of the main module has to be
alignup(GAP_ABOVE_TP, main_tls_align).
If there is no TLS in the main module then the gap can be ignored
since musl does not use it and the tls access models of shared
libraries are not affected.
The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0
(i.e. TLS did not require large alignment) because the gap was
treated as a fixed offset from TP. Now the TP points at the end
of the pthread struct (which is aligned) and there is a gap above
it (which may also need alignment).
The fix required changing TP_ADJ and __pthread_self on affected
targets (aarch64, arm and sh) and in the tlsdesc asm the offset to
access the dtv changed too.
|
|
PAGESIZE is actually the version defined in POSIX base, with PAGE_SIZE
being in the XSI option. use PAGESIZE as the underlying definition to
facilitate making exposure of PAGE_SIZE conditional.
|
|
statx was added in linux commit a528d35e8bfcc521d7cb70aaf03e1bd296c8493f
(there is no libc wrapper yet and microblaze and sh misses the number).
|
|
most of the found naming differences don't matter to musl, because
internally it unifies the syscall names that vary across targets,
but for external code the names should match the kernel uapi.
aarch64:
__NR_fstatat is called __NR_newfstatat in linux.
__NR_or1k_atomic got mistakenly copied from or1k.
arm:
__NR_arm_sync_file_range is an alias for __NR_sync_file_range2
__NR_fadvise64_64 is called __NR_arm_fadvise64_64 in linux,
the old non-arm name is kept too, it should not cause issues.
(powerpc has similar nonstandard fadvise and it uses the
normal name.)
i386:
__NR_madvise1 was removed from linux in commit
303395ac3bf3e2cb488435537d416bc840438fcb 2011-11-11
microblaze:
__NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite
had different name in linux.
mips:
__NR_fadvise, __NR_fstatat, __NR_pread, __NR_pwrite, __NR_select
had different name in linux.
mipsn32:
__NR_fstatat is called __NR_newfstatat in linux.
or1k:
__NR__llseek is called __NR_llseek in linux.
the old name is kept too because that's the name musl uses
internally.
powerpc:
__NR_{get,set}res{gid,uid}32 was never present in powerpc linux.
__NR_timerfd was briefly defined in linux but then got renamed.
|
|
see linux commit e8c24d3a23a469f1f40d4de24d872ca7023ced0a
and linux Documentation/x86/protection-keys.txt
|
|
It's identical to the generic version, after evaluating the endian
preprocessor checks in the generic version.
|
|
the syscalls take an additional flag argument, they were added in commit
f17d8b35452cab31a70d224964cd583fb2845449 and a RWF_HIPRI priority hint
flag was added to linux/fs.h in 97be7ebe53915af504fb491fb99f064c7cf3cb09.
the syscall is not allocated for microblaze and sh yet.
|
|
|
|
it was introduced for offloading copying between regular files
in linux commit 29732938a6289a15e907da234d6692a2ead71855
(microblaze and sh does not yet have the syscall number.)
|
|
currently five targets use the same mman.h constants and the rest
share most constants too, so move them to sys/mman.h before the
bits/mman.h include where the differences can be corrected by
redefinition of the macros.
this fixes two minor bugs: POSIX_MADV_DONTNEED was wrong on most
targets (it should be the same as MADV_DONTNEED), and sh defined
the x86-only MAP_32BIT mmap flag.
|
|
all bits headers that were identical for a number of 'clean' archs are
moved to the new arch/generic tree. in addition, a few headers that
differed only cosmetically from the new generic version are removed.
additional deduplication may be possible in mman.h and in several
headers (limits.h, posix.h, stdint.h) that mostly depend on whether
the arch is 32- or 64-bit, but they are left alone for now because
greater gains are likely possible with more invasive changes to header
logic, which is beyond the scope of this commit.
|
|
they lock faulted pages into memory (useful when a small part of a
large mapped file needs efficient access), new in linux v4.4, commit
b0f205c2a3082dd9081f9a94e50658c5fa906ff1
MLOCK_* is not in the POSIX reserved namespace for sys/mman.h
|
|
this is mlock with a flags argument, new in linux commit
a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e
as usual microblaze and sh don't have allocated syscall number yet.
|
|
new in linux v4.3 added for aarch64, arm, i386, mips, or1k, powerpc,
x32 and x86_64.
membarrier is a system wide memory barrier, moves most of the
synchronization cost to one side, new in kernel commit
5b25b13ab08f616efd566347d809b4ece54570d1
userfaultfd is useful for qemu and is new in kernel commit
8d2afd96c20316d112e04d935d9e09150e988397
switch_endian is powerpc only for switching endianness, new in commit
529d235a0e190ded1d21ccc80a73e625ebcad09b
|
|
rather than having each arch provide its own atomic.h, there is a new
shared atomic.h in src/internal which pulls arch-specific definitions
from arc/$(ARCH)/atomic_arch.h. the latter can be extremely minimal,
defining only a_cas or new ll/sc type primitives which the shared
atomic.h will use to construct everything else.
this commit avoids making heavy changes to the individual archs'
atomic implementations. definitions which are identical or
near-identical to what the new shared atomic.h would produce have been
removed, but otherwise the changes made are just hooking up the
arch-specific files to the new infrastructure. major changes to take
advantage of the new system will come in subsequent commits.
|
|
using the actual mcontext_t definition rather than an overlaid pointer
array both improves correctness/readability and eliminates some ugly
hacks for archs with 64-bit registers bit 32-bit program counter.
also fix UB due to comparison of pointers not in a common array
object.
|
|
other archs use asm for the thread pointer load, so making that asm
volatile is sufficient to inform the compiler that it has a "side
effect" (crashing or giving the wrong result if the thread pointer was
not yet initialized) that prevents reordering. however, powerpc and
or1k have dedicated general purpose registers for the thread pointer
and did not need to use any asm to access it; instead, "local register
variables with a specified register" were used. however, there is no
specification for ordering constraints on this type of usage, and
presumably use of the thread pointer could be reordered across its
initialization.
to impose an ordering, I have added empty volatile asm blocks that
produce the "local register variable with a specified register" as
an output constraint.
|
|
|
|
i386 and x86_64 versions already had the .text directive; other archs
did not. normally, top-level (file scope) __asm__ starts in the .text
section anyway, but problems were reported with some versions of
clang, and it seems preferable to set it explicitly anyway, at least
for the sake of consistency between archs.
|
|
remove __syscall declaration where it is not needed (aarch64, arm,
microblaze, or1k) and add the hidden attribute where it is (mips).
|
|
this overhaul further reduces the amount of arch-specific code needed
by the dynamic linker and removes a number of assumptions, including:
- that symbolic function references inside libc are bound at link time
via the linker option -Bsymbolic-functions.
- that libc functions used by the dynamic linker do not require
access to data symbols.
- that static/internal function calls and data accesses can be made
without performing any relocations, or that arch-specific startup
code handled any such relocations needed.
removing these assumptions paves the way for allowing libc.so itself
to be built with stack protector (among other things), and is achieved
by a three-stage bootstrap process:
1. relative relocations are processed with a flat function.
2. symbolic relocations are processed with no external calls/data.
3. main program and dependency libs are processed with a
fully-functional libc/ldso.
reduction in arch-specific code is achived through the following:
- crt_arch.h, used for generating crt1.o, now provides the entry point
for the dynamic linker too.
- asm is no longer responsible for skipping the beginning of argv[]
when ldso is invoked as a command.
- the functionality previously provided by __reloc_self for heavily
GOT-dependent RISC archs is now the arch-agnostic stage-1.
- arch-specific relocation type codes are mapped directly as macros
rather than via an inline translation function/switch statement.
|
|
while it's the same for all presently supported archs, it differs at
least on sparc, and conceptually it's no less arch-specific than the
other O_* macros. O_SEARCH and O_EXEC are still defined in terms of
O_PATH in the main fcntl.h.
|
|
the previous values (2k min and 8k default) were too small for some
archs. aarch64 reserves 4k in the signal context for future extensions
and requires about 4.5k total, and powerpc reportedly uses over 2k.
the new minimums are chosen to fit the saved context and also allow a
minimal signal handler to run.
since the default (SIGSTKSZ) has always been 6k larger than the
minimum, it is also increased to maintain the 6k usable by the signal
handler. this happens to be able to store one pathname buffer and
should be sufficient for calling any function in libc that doesn't
involve conversion between floating point and decimal representations.
x86 (both 32-bit and 64-bit variants) may also need a larger minimum
(around 2.5k) in the future to support avx-512, but the values on
these archs are left alone for now pending further analysis.
the value for PTHREAD_STACK_MIN is not increased to match MINSIGSTKSZ
at this time. this is so as not to preclude applications from using
extremely small thread stacks when they know they will not be handling
signals. unfortunately cancellation and multi-threaded set*id() use
signals as an implementation detail and therefore require a stack
large enough for a signal context, so applications which use extremely
small thread stacks may still need to avoid using these features.
|
|
Implemented as a wrapper around fegetround introducing a new function
to the ABI: __flt_rounds. (fegetround cannot be used directly from float.h)
|
|
these macros have the same distinct definition on blackfin, frv, m68k,
mips, sparc and xtensa kernels. POLLMSG and POLLRDHUP additionally
differ on sparc.
|
|
the memory model we use internally for atomics permits plain loads of
values which may be subject to concurrent modification without
requiring that a special load function be used. since a compiler is
free to make transformations that alter the number of loads or the way
in which loads are performed, the compiler is theoretically free to
break this usage. the most obvious concern is with atomic cas
constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
multiple loads of *p whose resulting values might fail to be equal;
this would break the atomicity of the whole operation. but even more
fundamental breakage is possible.
with the changes being made now, objects that may be modified by
atomics are modeled as volatile, and the atomic operations performed
on them by other threads are modeled as asynchronous stores by
hardware which happens to be acting on the request of another thread.
such modeling of course does not itself address memory synchronization
between cores/cpus, but that aspect was already handled. this all
seems less than ideal, but it's the best we can do without mandating a
C11 compiler and using the C11 model for atomics.
in the case of pthread_once_t, the ABI type of the underlying object
is not volatile-qualified. so we are assuming that accessing the
object through a volatile-qualified lvalue via casts yields volatile
access semantics. the language of the C standard is somewhat unclear
on this matter, but this is an assumption the linux kernel also makes,
and seems to be the correct interpretation of the standard.
|
|
this syscall allows fexecve to be implemented without /proc, it is new
in linux v3.19, added in commit 51f39a1f0cea1cacf8c787f652f26dfee9611874
(sh and microblaze do not have allocated syscall numbers yet)
added a x32 fix as well: the io_setup and io_submit syscalls are no
longer common with x86_64, so use the x32 specific numbers.
|
|
the definitions are generic for all kernel archs. exposure of these
macros now only occurs on the same feature test as for the function
accepting them, which is believed to be more correct.
|
|
these syscalls are new in linux v3.18, bpf is present on all
supported archs except sh, kexec_file_load is only allocted for
x86_64 and x32 yet.
bpf was added in linux commit 99c55f7d47c0dc6fc64729f37bf435abf43f4c60
kexec_file_load syscall number was allocated in commit
f0895685c7fd8c938c91a9d8a6f7c11f22df58d2
|
|
|
|
except powerpc, which still lacks inline syscalls simply because
nobody has written the code, these are all fallbacks used to work
around a clang bug that probably does not exist in versions of clang
that can compile musl. however, it's useful to have the generic
non-inline code anyway, as it eases the task of porting to new archs:
writing inline syscall code is now optional. this approach could also
help support compilers which don't understand inline asm or lack
support for the needed register constraints.
mips could not be unified because it has special fixup code for broken
layout of the kernel's struct stat.
|
|
the kernel syscall interface for or1k does not expect 64-bit arguments
to be aligned to "even" register boundaries. this incorrect alignment
broke truncate/ftruncate and as well as a few less-common syscalls.
|
|
|
|
these syscalls are new in linux v3.17 and present on all supported
archs except sh.
seccomp was added in commit 48dc92b9fc3926844257316e75ba11eb5c742b2c
it has operation, flags and pointer arguments (if flags==0 then it is
the same as prctl(PR_SET_SECCOMP,...)), the uapi header for flag
definitions is linux/seccomp.h
getrandom was added in commit c6e9d6f38894798696f23c8084ca7edbf16ee895
it provides an entropy source when open("/dev/urandom",..) would fail,
the uapi header for flags is linux/random.h
memfd_create was added in commit 9183df25fe7b194563db3fec6dc3202a5855839c
it allows anon mmap to have an fd, that can be shared, sealed and needs no
mount point, the uapi header for flags is linux/memfd.h
|
|
based on patch by Jens Gustedt.
mtx_t and cnd_t are defined in such a way that they are formally
"compatible types" with pthread_mutex_t and pthread_cond_t,
respectively, when accessed from a different translation unit. this
makes it possible to implement the C11 functions using the pthread
functions (which will dereference them with the pthread types) without
having to use the same types, which would necessitate either namespace
violations (exposing pthread type names in threads.h) or incompatible
changes to the C++ name mangling ABI for the pthread types.
for the rest of the types, things are much simpler; using identical
types is possible without any namespace considerations.
|
|
conceptually, a_spin needs to be at least a compiler barrier, so the
compiler will not optimize out loops (and the load on each iteration)
while spinning. it should also be a memory barrier, or the spinning
thread might keep spinning without noticing stores from other threads,
thus delaying for longer than it should.
ideally, an optimal a_spin implementation that avoids unnecessary
cache/memory contention should be chosen for each arch, but for now,
the easiest thing is to perform a useless a_cas on the calling
thread's stack.
|
|
unfortunately this needs to be able to vary by arch, because of a huge
mess GCC made: the GCC definition, which became the ABI, depends on
quirks in GCC's definition of __alignof__, which does not match the
formal alignment of the type.
GCC's __alignof__ unexpectedly exposes the an implementation detail,
its "preferred alignment" for the type, rather than the formal/ABI
alignment of the type, which it only actually uses in structures. on
most archs the two values are the same, but on some (at least i386)
the preferred alignment is greater than the ABI alignment.
I considered using _Alignas(8) unconditionally, but on at least one
arch (or1k), the alignment of max_align_t with GCC's definition is
only 4 (even the "preferred alignment" for these types is only 4).
|
|
when manipulating the robust list, the order of stores matters,
because the code may be asynchronously interrupted by a fatal signal
and the kernel will then access the robust list in what is essentially
an async-signal context.
previously, aliasing considerations made it seem unlikely that a
compiler could reorder the stores, but proving that they could not be
reordered incorrectly would have been extremely difficult. instead
I've opted to make all the pointers used as part of the robust list,
including those in the robust list head and in the individual mutexes,
volatile.
in addition, the format of the robust list has been changed to point
back to the head at the end, rather than ending with a null pointer.
this is to match the documented kernel robust list ABI. the null
pointer, which was previously used, only worked because faults during
access terminate the robust list processing.
|
|
for or1k, the kernel expects the offset passed to mmap2 in units of
the 8k page size, not the standard unit of 4k used on most other
archs.
|
|
according to Stefan Kristiansson, or1k page size is not actually
variable and the value of 8192 is part of the ABI.
|
|
this follows the same logic as in the previous commit for other archs.
|
|
it's like rename but with flags eg. to allow atomic exchange of two files,
introduced in linux 3.15 commit 520c8b16505236fc82daa352e6c5e73cd9870cff
|
|
at the very least, a compiler barrier is required no matter what, and
that was missing. current or1k implementations have strong ordering,
but this is not guaranteed as part of the ISA, so some sort of
synchronizing operation is necessary.
in principle we should use l.msync, but due to misinterpretation of
the spec, it was wrongly treated as an optional instruction and is not
supported by some implementations. if future kernels trap it and treat
it as a nop (rather than illegal instruction) when the
hardware/emulator does not support it, we could consider using it.
in the absence of l.msync support, the l.lwa/l.swa instructions, which
are specified to have a built-in l.msync, need to be used. the easiest
way to use them to implement atomic store is to perform an atomic swap
and throw away the result. using compare-and-swap would be lighter,
and would probably be sufficient for all actual usage cases, but
checking this is difficult and error-prone:
with store implemented in terms of swap, it's guaranteed that, when
another atomic operation is performed at the same time as the store,
either the result of the store followed by the other operation, or
just the store (clobbering the other operation's result) is seen. if
store were implemented in terms of cas, there are cases where this
invariant would fail to hold, and we would need detailed rules for the
situations in which the store operation is well-defined.
|
|
With the exception of a fenv implementation, the port is fully featured.
The port has been tested in or1ksim, the golden reference functional
simulator for OpenRISC 1000.
It passes all libc-test tests (except the math tests that
requires a fenv implementation).
The port assumes an or1k implementation that has support for
atomic instructions (l.lwa/l.swa).
Although it passes all the libc-test tests, the port is still
in an experimental state, and has yet experienced very little
'real-world' use.
|