path: root/src/mman
Age | Commit message | Author | Lines
2018-09-12 | overhaul internally-public declarations using wrapper headers | Rich Felker | -2/+0
commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfaces between subsystems of libc.

rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to.

along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features.

over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2017-09-06 | work around incorrect EPERM from mmap syscall | Rich Felker | -2/+7
under some conditions, the mmap syscall wrongly fails with EPERM instead of ENOMEM when memory is exhausted; this is probably the result of the kernel trying to fit the allocation somewhere that crosses into the kernel range or below mmap_min_addr. in any case it's a conformance bug, so work around it. for now, only handle the case of anonymous mappings with no requested address; in other cases EPERM may be a legitimate error. this indirectly fixes the possibility of malloc failing with the wrong errno value.
2017-04-21 | allow full-range file offsets to mmap on archs with 64-bit syscall args | Rich Felker | -1/+1
normally 32-bit archs use the mmap2 syscall and are limited to an offset of 2^32 pages. however some 32-bit archs (mainly ILP32-on-64 ones like x32) have 64-bit syscall argument slots and thus can accept the full range. don't artificially limit them.
2015-11-02 | fix mremap memory synchronization and use of variadic argument | Rich Felker | -4/+11
since mremap with the MREMAP_FIXED flag is an operation that unmaps existing mappings, it needs to use the vm lock mechanism to ensure that any in-progress synchronization operations using vm identities from before the call have finished. also, the variadic argument was erroneously being read even if the MREMAP_FIXED flag was not passed. in practice this didn't break anything, but it's UB and in theory LTO could turn it into a hard error.
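The variadic-argument part of this fix can be sketched as follows. This is an illustration only: `remap_sketch`, `remap_backend`, and `SKETCH_REMAP_FIXED` are hypothetical stand-ins (for the wrapper, the underlying syscall, and `MREMAP_FIXED` respectively), and the vm lock handling is omitted.

```c
#include <stdarg.h>
#include <stddef.h>

#define SKETCH_REMAP_FIXED 2 /* hypothetical stand-in for MREMAP_FIXED */

/* Hypothetical backend standing in for the real syscall. */
static void *remap_backend(void *old, size_t old_len, size_t new_len,
                           int flags, void *new_addr)
{
    (void)old_len;
    (void)new_len;
    return (flags & SKETCH_REMAP_FIXED) ? new_addr : old;
}

void *remap_sketch(void *old, size_t old_len, size_t new_len, int flags, ...)
{
    void *new_addr = 0;

    /* Read the optional argument only when the flag says the caller
     * supplied one; reading it unconditionally is undefined behavior. */
    if (flags & SKETCH_REMAP_FIXED) {
        va_list ap;
        va_start(ap, flags);
        new_addr = va_arg(ap, void *);
        va_end(ap);
    }
    return remap_backend(old, old_len, new_len, flags, new_addr);
}
```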
2015-11-02 | prevent allocs larger than PTRDIFF_MAX via mremap | Daniel Micay | -1/+8
It's quite feasible for this to happen via MREMAP_MAYMOVE.
2015-04-10 | redesign and simplify vmlock system | Rich Felker | -15/+7
this global lock allows certain unlock-type primitives to exclude mmap/munmap operations which could change the identity of virtual addresses while references to them still exist. the original design mistakenly assumed mmap/munmap would conversely need to exclude the same operations which exclude mmap/munmap, so the vmlock was implemented as a sort of 'symmetric recursive rwlock'. this turned out to be unnecessary. commit 25d12fc0fc51f1fae0f85b4649a6463eb805aa8f already shortened the interval during which mmap/munmap held their side of the lock, but left the inappropriate lock design and some inefficiency. the new design uses a separate function, __vm_wait, which does not hold any lock itself and only waits for lock users which were already present when it was called to release the lock. this is sufficient because of the way operations that need to be excluded are sequenced: the "unlock-type" operations using the vmlock need only block mmap/munmap operations that are precipitated by (and thus sequenced after) the atomic-unlock they perform while holding the vmlock. this allows for a spectacular lack of synchronization in the __vm_wait function itself.
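A rough sketch of the shape of this design, using C11 atomics and `sched_yield` in place of libc-internal atomics and futex waits. All names here (`vm_users`, `vm_lock_sketch`, `vm_unlock_sketch`, `vm_wait_sketch`) are hypothetical, and the sketch makes no claim to match the actual implementation.

```c
#include <sched.h>
#include <stdatomic.h>

/* Number of threads currently inside an "unlock-type" critical region. */
static atomic_int vm_users;

/* Taken by unlock-type primitives around their atomic unlock. */
void vm_lock_sketch(void)
{
    atomic_fetch_add(&vm_users, 1);
}

void vm_unlock_sketch(void)
{
    atomic_fetch_sub(&vm_users, 1);
}

/* Called by mmap/munmap-style operations before changing mappings.
 * It holds no lock itself; it only waits until no lock users remain. */
void vm_wait_sketch(void)
{
    while (atomic_load(&vm_users) != 0)
        sched_yield();
}
```

The waiter never blocks new lock users. As the commit message argues, that is acceptable because the only mapping changes that must be excluded are those sequenced after an atomic unlock performed while the lock is held.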
2015-01-30 | make fsync, fdatasync, and msync cancellation points | Trutz Behn | -1/+1
these are mandatory cancellation points per POSIX, so their omission was a conformance bug.
2014-09-06 | use weak symbols for the POSIX functions that will be used by C threads | Jens Gustedt | -1/+3
The intent of this is to avoid name space pollution of the C threads implementation. This has two sides to it. First we have to provide symbols that wouldn't pollute the name space for the C threads implementation. Second we have to clean up some internal uses of POSIX functions such that they don't implicitly drag in such symbols.
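One common way to get this effect is to define the function under an internal name and export the public name as a weak alias. The sketch below assumes GCC or Clang on an ELF target; the macro and function names are hypothetical and chosen only for illustration.

```c
/* Assumes GCC/Clang attribute syntax on an ELF target. */
#define weak_alias_sketch(old, new) \
    extern __typeof__(old) new __attribute__((weak, alias(#old)))

/* Internal name: other library code calls this directly, so it never
 * depends on the public symbol. */
int sketch_impl_add(int a, int b)
{
    return a + b;
}

/* Public name: a weak alias for the internal implementation. */
weak_alias_sketch(sketch_impl_add, sketch_add);
```

Internal callers use `sketch_impl_add`, so linking them does not drag in, or conflict with, an application's own definition of the public name.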
2014-08-16 | optimize locking against vm changes for mmap/munmap | Rich Felker | -8/+7
the whole point of this locking is to prevent munmap, or mmap with MAP_FIXED, from deallocating virtual addresses, or changing the backing a given virtual address refers to, during certain race windows involving self-synchronized unmapping or destruction of pthread synchronization objects. there is no need for exclusion in the other direction, so it suffices to take the lock momentarily and release it before making the syscall, rather than holding it across the syscall.
2014-07-30 | add framework for mmap2 syscall unit to vary by arch | Rich Felker | -2/+3
2013-12-12 | include cleanups: remove unused headers and add feature test macros | Szabolcs Nagy | -2/+0
2013-09-15 | support configurable page size on mips, powerpc and microblaze | Szabolcs Nagy | -1/+1
PAGE_SIZE was hardcoded to 4096, which is historically what most systems use, but on several archs it is a kernel config parameter, so user space can only know it at execution time from the aux vector. PAGE_SIZE and PAGESIZE are not defined on archs where page size is a runtime parameter; applications should use sysconf(_SC_PAGE_SIZE) to query it. Internally libc code defines PAGE_SIZE to libc.page_size, which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink as well. (Note that libc.page_size can be accessed without GOT, i.e. before relocations are done.) Some fpathconf settings are hardcoded to 4096; these should actually be queried from the filesystem using statfs.
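From the application side, the portable query mentioned above looks like this. The wrapper name `runtime_page_size` is hypothetical; `sysconf(_SC_PAGE_SIZE)` is the standard call.

```c
#include <unistd.h>

/* Query the page size at run time instead of relying on a
 * compile-time PAGE_SIZE constant, which may not be defined. */
long runtime_page_size(void)
{
    return sysconf(_SC_PAGE_SIZE);
}
```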
2013-07-20 | fix shm_open wrongly being cancellable | Rich Felker | -1/+6
2013-06-27 | disallow creation of objects larger than PTRDIFF_MAX via mmap | Rich Felker | -0/+5
internally, other parts of the library assume sizes don't overflow ssize_t and/or ptrdiff_t, and the way this assumption is made valid is by preventing creation of such large objects. malloc already does so, but the check was missing from mmap. this is also a quality of implementation issue: even if the implementation internally could handle such objects, applications could inadvertently invoke undefined behavior by subtracting pointers within an object. it is very difficult to guard against this in applications, so a good implementation should simply ensure that it does not happen.
2012-12-20 | clean up and fix logic for making mmap fail on invalid/unsupported offsets | Rich Felker | -3/+7
the previous logic was assuming the kernel would give EINVAL when passed an invalid address, but instead with MAP_FIXED it was giving EPERM, as it considered this an attempt to map over kernel memory. instead of trying to get the kernel to do the right thing, the new code just handles the error in userspace. I have also cleaned up the code to use a single mask to check for invalid low bits and unsupported high bits, so it's simpler and more clearly correct. the old code was actually wrong for sizeof(long) smaller than sizeof(off_t) but not equal to 4; now it should be correct for all possibilities. for 64-bit systems, the low-bits test is new and extraneous (the kernel should catch the error anyway when the mmap2 syscall is not used), but it's cheap anyway. if this is an issue, the OFF_MASK definition could be tweaked to omit the low bits when SYS_mmap2 is not defined.
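The single-mask idea can be sketched as below, assuming a 4096-byte mmap2 unit and a 64-bit file offset. The function names and the `arg_bytes` parameter (the size of a syscall argument, i.e. sizeof(long) in the terms of this commit) are hypothetical; the actual OFF_MASK definition may differ.

```c
#include <stdint.h>

/* Build one mask covering both kinds of invalid offset bits:
 * - low bits: offset not a multiple of the 4096-byte unit
 * - high bits: offset too large to express in units that fit in a
 *   syscall argument of arg_bytes bytes (4 or 8). */
uint64_t off_mask_sketch(unsigned arg_bytes)
{
    uint64_t high = -UINT64_C(0x2000) << (8 * arg_bytes - 1);
    return high | UINT64_C(0xfff);
}

/* An offset is acceptable only if none of the masked bits are set. */
int offset_ok_sketch(uint64_t off, unsigned arg_bytes)
{
    return (off & off_mask_sketch(arg_bytes)) == 0;
}
```

For 4-byte arguments the high part of the mask covers bits 44 and up, which corresponds to 2^32 units of 4096 bytes; for 8-byte arguments it is empty, so only the low-bits test remains.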
2012-09-30 | overhaul sem_open | Rich Felker | -3/+3
this function was overly complicated and not even obviously correct. avoid using openat/linkat just like in shm_open, and instead expand pathname using code shared with shm_open. remove bogus (and dangerous, with priorities) use of spinlocks. this commit also heavily streamlines the code and ensures there are no failure cases that can happen after a new semaphore has been created in the filesystem, since that case is unreportable.
2012-09-30 | clean up, bugfixes, and general improvement for shm_open/shm_unlink | Rich Felker | -30/+28
1. don't make non-cloexec file descriptors
2. cancellation safety (cleanup handlers were missing, now unneeded)
3. share name validation/mapping code between open/unlink functions
4. avoid wasteful/slow syscalls
2012-09-09 | mincore syscall wrapper | Rich Felker | -0/+8
2011-09-27 | process-shared barrier support, based on discussion with bdonlan | Rich Felker | -3/+21
this implementation is rather heavy-weight, but it's the first solution i've found that's actually correct. all waiters actually wait twice at the barrier so that they can synchronize exit, and they hold a "vm lock" that prevents changes to virtual memory mappings (and blocks pthread_barrier_destroy) until all waiters are finished inspecting the barrier. thus, it is safe for any thread to destroy and/or unmap the barrier's memory as soon as pthread_barrier_wait returns, without further synchronization.
2011-06-29 | work around linux bug in mprotect | Rich Felker | -1/+5
per POSIX: The mprotect() function shall change the access protections to be that specified by prot for those whole pages containing any part of the address space of the process starting at address addr and continuing for len bytes. on the other hand, linux mprotect fails with EINVAL if the base address and/or length is not page-aligned, so we have to align them before making the syscall.
2011-04-20 | fix missing include in posix_madvise.c (compile error) | Rich Felker | -0/+1
2011-04-20 | support posix_madvise (previously a stub) | Rich Felker | -1/+3
the check against MADV_DONTNEED is there because linux MADV_DONTNEED semantics conflict dangerously with the POSIX semantics
2011-04-06 | consistency: change all remaining syscalls to use SYS_ rather than __NR_ prefix | Rich Felker | -1/+1
2011-03-20 | global cleanup to use the new syscall interface | Rich Felker | -11/+11
2011-03-03 | implement POSIX shared memory | Rich Felker | -0/+42
2011-02-13 | cleaning up syscalls in preparation for x86_64 port | Rich Felker | -0/+4
- hide all the legacy xxxxxx32 name cruft in syscall.h so the actual source files can be clean and uniform across all archs.
- cleanup llseek/lseek and mmap2/mmap handling for 32/64 bit systems
- alternate implementation for nice if the target lacks nice syscall
2011-02-12 | initial check-in, version 0.5.0 (tag: v0.5.0) | Rich Felker | -0/+107