Age | Commit message (Collapse) | Author | Lines |
|
the memory model we use internally for atomics permits plain loads of
values which may be subject to concurrent modification without
requiring that a special load function be used. since a compiler is
free to make transformations that alter the number of loads or the way
in which loads are performed, the compiler is theoretically free to
break this usage. the most obvious concern is with atomic cas
constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
multiple loads of *p whose resulting values might fail to be equal;
this would break the atomicity of the whole operation. but even more
fundamental breakage is possible.
with the changes being made now, objects that may be modified by
atomics are modeled as volatile, and the atomic operations performed
on them by other threads are modeled as asynchronous stores by
hardware which happens to be acting on the request of another thread.
such modeling of course does not itself address memory synchronization
between cores/cpus, but that aspect was already handled. this all
seems less than ideal, but it's the best we can do without mandating a
C11 compiler and using the C11 model for atomics.
in the case of pthread_once_t, the ABI type of the underlying object
is not volatile-qualified. so we are assuming that accessing the
object through a volatile-qualified lvalue via casts yields volatile
access semantics. the language of the C standard is somewhat unclear
on this matter, but this is an assumption the linux kernel also makes,
and seems to be the correct interpretation of the standard.
|
|
|
|
previously, the __timedwait function was optionally a cancellation
point depending on whether it was passed a pointer to a cleaup
function and context to register. as of now, only one caller actually
used such a cleanup function (and it may face removal soon); most
callers either passed a null pointer to disable cancellation or a
dummy cleanup function.
now, __timedwait is never a cancellation point, and __timedwait_cp is
the cancellable version. this makes the intent of the calling code
more obvious and avoids ugly dummy functions and long argument lists.
|
|
a_store is only valid for int, but ssize_t may be defined as long or
another type. since there is no valid way for another thread to acess
the return value without first checking the error/completion status of
the aiocb anyway, an atomic store is not necessary.
|
|
previously, aio operations were not tracked by file descriptor; each
operation was completely independent. this resulted in non-conforming
behavior for non-seekable/append-mode writes (which are required to be
ordered) and made it impossible to implement aio_cancel, which in turn
made closing file descriptors with outstanding aio operations unsafe.
the new implementation is significantly heavier (roughly twice the
size, and seems to be slightly slower) and presently aims mainly at
correctness, not performance.
most of the public interfaces have been moved into a single file,
aio.c, because there is little benefit to be had from splitting them.
whenever any aio functions are used, aio_cancel and the internal
queue lifetime management and fd-to-queue mapping code must be linked,
and these functions make up the bulk of the code size.
the close function's interaction with aio is implemented with weak
alias magic, to avoid pulling in heavy aio cancellation code in
programs that don't use aio, and the expensive cancellation path
(which includes signal blocking) is optimized out when there are no
active aio queues.
|
|
versionsort64, aio*64 and lio*64 symbols were missing, they are
only needed for glibc ABI compatibility, on the source level
dirent.h and aio.h already redirect them.
|
|
the main motivation for this change is to remove the assumption that
the tid of the main thread is also the pid of the process. (the value
returned by the set_tid_address syscall was used to fill both fields
despite it semantically being the tid.) this is historically and
presently true on linux and unlikely to change, but it conceivably
could be false on other systems that otherwise reproduce the linux
syscall api/abi.
only a few parts of the code were actually still using the cached pid.
in a couple places (aio and synccall) it was a minor optimization to
avoid a syscall. caching could be reintroduced, but lazily as part of
the public getpid function rather than at program startup, if it's
deemed important for performance later. in other places (cancellation
and pthread_kill) the pid was completely unnecessary; the tkill
syscall can be used instead of tgkill. this is actually a rather
subtle issue, since tgkill is supposedly a solution to race conditions
that can affect use of tkill. however, as documented in the commit
message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill
does not actually solve this race; it just limits it to happening
within one process rather than between processes. we use a lock that
avoids the race in pthread_kill, and the use in the cancellation
signal handler is self-targeted and thus not subject to tid reuse
races, so both are safe regardless of which syscall (tgkill or tkill)
is used.
|
|
PAGE_SIZE was hardcoded to 4096, which is historically what most
systems use, but on several archs it is a kernel config parameter,
user space can only know it at execution time from the aux vector.
PAGE_SIZE and PAGESIZE are not defined on archs where page size is
a runtime parameter, applications should use sysconf(_SC_PAGE_SIZE)
to query it. Internally libc code defines PAGE_SIZE to libc.page_size,
which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink
as well. (Note that libc.page_size can be accessed without GOT, ie.
before relocations are done)
Some fpathconf settings are hardcoded to 4096, these should be actually
queried from the filesystem using statfs.
|
|
issue found and patch provided by Jens Gustedt. after the atomic store
to the error code field of the aiocb, the application is permitted to
free or reuse the storage, so further access is invalid. instead, use
the local copy that was already made.
|
|
|
|
|
|
for some reason I have not been able to determine, gcc 3.2 rejects the
array notation. this seems to be a gcc bug, but since it's easy to
work around, let's do the workaround and avoid gratuitously requiring
newer compilers.
|
|
this mirrors the stdio_impl.h cleanup. one header which is not
strictly needed, errno.h, is left in pthread_impl.h, because since
pthread functions return their error codes rather than using errno,
nearly every single pthread function needs the errno constants.
in a few places, rather than bringing in string.h to use memset, the
memset was replaced by direct assignment. this seems to generate much
better code anyway, and makes many functions which were previously
non-leaf functions into leaf functions (possibly eliminating a great
deal of bloat on some platforms where non-leaf functions require ugly
prologue and/or epilogue).
|
|
to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.
|
|
|
|
i blame this one on posix for using hideous const-qualified double
pointers which are unusable without hideous casts.
|
|
|
|
previous fix was backwards and propagated the wrong type rather than
the right one...
|
|
|
|
some features are not yet supported, and only minimal testing has been
performed. should be considered experimental at this point.
|