musl - musl - an implementation of the standard library for Linux-based systems

Age	Commit message (Collapse)	Author	Lines
2020-05-21	handle possibility that SIGEMT replaces SIGSTKFLT in strsignal	Rich Felker	-0/+10
	presently all archs define SIGSTKFLT but this is not correct. change strsignal as a prerequisite for fixing that.
2020-04-30	fix undefined behavior from signed overflow in strstr and memmem	Rich Felker	-8/+8
	unsigned char promotes to int, which can overflow when shifted left by 24 bits or more. this has been reported multiple times but then forgotten. it's expected to be benign UB, but can trap when built with explicit overflow catching (ubsan or similar). fix it now. note that promotion to uint32_t is safe and portable even outside of the assumptions usually made in musl, since either uint32_t has rank at least unsigned int, so that no further default promotions happen, or int is wide enough that the shift can't overflow. this is a desirable property to have in case someone wants to reuse the code elsewhere.
2020-03-20	remove redundant condition in memccpy	Alexander Monakov	-1/+1
	Commit d9bdfd164 ("fix memccpy to not access buffer past given size") correctly added a check for 'n' nonzero, but made the pre-existing test 's==c' redundant: n!=0 implies s==c. Remove the unnecessary check. Reported by Alexey Izbyshev.
2020-01-16	add thumb2 support to arm assembler memcpy	Andre McCurdy	-6/+9
	For Thumb2 compatibility, replace two instances of a single instruction "orr with a variable shift" with the two instruction equivalent. Neither of the replacements are in a performance critical loop.
2018-12-02	fix memccpy to not access buffer past given size	Quentin Rameau	-1/+1
	memccpy would return a pointer over the given size when c is not found in the source buffer and n reaches 0.
2018-11-08	optimize two-way strstr and memmem bad character shift	Rich Felker	-2/+2
	first, the condition (mem && k < p) is redundant, because mem being nonzero implies the needle is periodic with period exactly p, in which case any byte that appears in the needle must appear in the last p bytes of the needle, bounding the shift (k) by p. second, the whole point of replacing the shift k by mem (=l-p) is to prevent shifting by less than mem when discarding the memory on shift, in which case linear time could not be guaranteed. but as written, the check also replaced shifts greater than mem by mem, reducing the benefit of the shift. there is no possible benefit to this reduction of the shift; since mem is being cleared, the full shift is valid and more optimal. so only replace the shift by mem when it would be less than mem.
2018-11-02	remove commented-out debug printf from strstr	Rich Felker	-1/+0
	this was leftover from before the initial commit.
2018-11-02	fix spuriously slow check in twoway strstr/memmem cores	Rich Felker	-2/+2
	mem0 && mem && ... is redundant since mem can only be nonzero when mem0 is nonzero.
2018-09-26	fix aliasing-based undefined behavior in string functions	Rich Felker	-19/+46
	use the GNU C may_alias attribute if available, and fallback to naive byte-by-byte loops if __GNUC__ is not defined. this patch has been written to minimize changes so that history remains reviewable; it does not attempt to bring the affected code into a more consistent or elegant form.
2018-09-23	optimize nop case of wmemmove	Rich Felker	-0/+1

2018-09-23	fix undefined pointer comparison in wmemmove	Rich Felker	-1/+2

2018-09-23	fix undefined pointer comparison in memmove	Rich Felker	-1/+1
	the comparison must take place in the address space model as an integer type, since comparing pointers that are not pointing into the same array is undefined. the subsequent d<s comparison however is valid, because it's only reached in the case where the source and dest overlap, in which case they are necessarily pointing to parts of the same array. to make the comparison, use an unsigned range check for dist(s,d)>=n, algebraically !(-n<s-d<n). subtracting n yields !(-2n<s-d-n<0), which mapped into unsigned modular arithmetic is !(-2n<s-d-n) or rather -2*n>=s-d-n.
2018-09-12	reduce spurious inclusion of libc.h	Rich Felker	-9/+0
	libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2018-09-12	remove or make static various unused __-prefixed symbols	Rich Felker	-4/+1

2018-09-12	overhaul internally-public declarations using wrapper headers	Rich Felker	-10/+0
	commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfacs between subsystems of libc. rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to. along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features. over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2018-09-12	remove unused code from strcpy.c	Rich Felker	-7/+0

2018-07-02	optimize explicit_bzero for size	Alexander Monakov	-1/+1
	Avoid saving/restoring the incoming argument by reusing memset return value.
2018-06-26	add explicit_bzero implementation	David Carlier	-0/+8
	maintainer's note: past sentiment was that, despite being imperfect and unable to force clearing of all possible copies of sensitive data (e.g. in registers, register spills, signal contexts left on the stack, etc.) this function would be added if major implementations agreed on it, which has happened -- several BSDs and glibc all include it.
2017-09-04	fix OOB reads in Xbyte_memmem	Alexander Monakov	-9/+9
	Reported by Leah Neukirchen.
2017-08-29	fix undefined behavior in memset due to missing sequence points	Rich Felker	-4/+8
	patch by Pascal Cuoq.
2017-06-22	fix arm run-time abi string functions	Szabolcs Nagy	-36/+76
	in arm rtabi these __aeabi_* functions have special abi (they are only allowed to clobber r0,r1,r2,r3,ip,lr,cpsr), so they cannot be simple wrappers around normal string functions (which may clobber other registers), the safest solution is to write them in asm, a minimalistic implementation works because these are not supposed to be emitted by compilers or used in general.
2016-12-17	disable use of arm memcpy asm if building as thumb code	Rich Felker	-2/+2
	the thumb incompatibilities in the asm are probably only minor and should be fixable, but for now just use the C version.
2016-04-01	fix read past end of haystack buffer for short needles in memmem	Rich Felker	-0/+1
	the two/three/four byte memmem specializations are not prepared to handle haystacks shorter than the needle; they unconditionally read at least up to the needle length and subtract from the haystack length. if the haystack is shorter, the remaining haystack length underflows and produces an unbounded search which will eventually either crash or find a spurious match. the top-level memmem function attempted to avoid this case already by checking for haystack shorter than needle, but it failed to re-check after using memchr to remove the maximal prefix not containing the first byte of the needle.
2016-01-22	move arm-specific translation units out of arch/arm/src, to src/*/arm	Rich Felker	-0/+36
	this is possible with the new build system that allows src//$(ARCH)/ files which do not shadow a file in the parent directory, and yields a more logical organization. eventually it will be possible to remove arch/*/src from the build system.
2016-01-20	adapt build of arm memcpy asm not to use .sub files	Rich Felker	-2/+7
	this depends on commit 9f5eb77992b42d484d69e879d24ef86466f20f21, which made it possible to use a .c file for arch-specific replacements, and on commit 2f853dd6b9a95d5b13ee8f9df762125e0588df5d, the out-of-tree build support, which made it so that src//$(ARCH)/ 'replacement' files get used even if they don't match the base name of a .c file in the parent directory.
2015-11-09	remove non-working pre-armv4t support from arm asm	Rich Felker	-4/+0
	the idea of the three-instruction sequence being removed was to be able to return to thumb code when used on armv4t+ from a thumb caller, but also to be able to run on armv4 without the bx instruction available (in which case the low bit of lr would always be 0). however, without compiler support for generating such a sequence from C code, which does not exist and which there is unlikely to be interest in implementing, there is little point in having it in the asm, and it would likely be easier to add pre-armv4t support via enhanced linker handling of R_ARM_V4BX than at the compiler level. removing this code simplifies adding support for building libc in thumb2-only form (for cortex-m).
2015-11-05	convert arm memcpy asm to UAL, remove .word hacks	Rich Felker	-22/+24
	contrary to commit 9367fe926196f407705bb07cd29c6e40eb1774dd, all relevant gas versions actually do support .syntax unified.
2015-06-23	reimplement strverscmp to fix corner cases	Rich Felker	-32/+25
	this interface is non-standardized and is a GNU invention, and as such, our implementation should match the behavior of the GNU function. one peculiarity the old implementation got wrong was the handling of all-zero digit sequences: they are supposed to compare greater than digit sequences of which they are a proper prefix, as in 009 < 00. in addition, high bytes were treated with char signedness rather than as unsigned. this was wrong regardless of what the GNU function does since the resulting order relation varied by arch. the new strverscmp implementation makes explicit the cases where the order differs from what strcmp would produce, of which there are only two.
2015-04-18	remove potentially PIC-incompatible relocations from x86_64 and x32 asm	Rich Felker	-1/+5
	analogous to commit 8ed66ecbcba1dd0f899f22b534aac92a282f42d5 for i386.
2015-04-18	remove the last of possible-textrels from i386 asm	Rich Felker	-1/+5
	none of these are actual textrels because of ld-time binding performed by -Bsymbolic-functions, but I'm changing them with the goal of making ld-time binding purely an optimization rather than relying on it for semantic purposes. in the case of memmove's call to memcpy, making it explicit that the memmove asm is assuming the forward-copying behavior of the memcpy asm is desirable anyway; in case memcpy is ever changed, the semantic mismatch would be apparent while editing memmcpy.s.
2015-02-26	overhaul optimized x86_64 memset asm	Rich Felker	-26/+55
	on most cpu models, "rep stosq" has high overhead that makes it undesirable for small memset sizes. the new code extends the minimal-branch fast path for short memsets from size 15 up to size 126, and shrink-wraps this code path. in addition, "rep stosq" is sensitive to misalignment. the cost varies with size and with cpu model, but it has been observed performing 1.5 times slower when the destination address is not aligned mod 16. the new code thus ensures alignment mod 16, but also preserves any existing additional alignment, in case there are cpu models where it is beneficial. this version is based in part on changes proposed by Denys Vlasenko.
2015-02-26	overhaul optimized i386 memset asm	Rich Felker	-32/+61
	on most cpu models, "rep stosl" has high overhead that makes it undesirable for small memset sizes. the new code extends the minimal-branch fast path for short memsets from size 15 up to size 62, and shrink-wraps this code path. in addition, "rep stosl" is very sensitive to misalignment. the cost varies with size and with cpu model, but it has been observed performing 1.5 to 4 times slower when the destination address is not aligned mod 16. the new code thus ensures alignment mod 16, but also preserves any existing additional alignment, in case there are cpu models where it is beneficial. this version is based in part on changes to the x86_64 memset asm proposed by Denys Vlasenko.
2015-02-10	x86_64/memset: avoid performing final store twice	Denys Vlasenko	-1/+1
	The code does a potentially misaligned 8-byte store to fill the tail of the buffer. Then it fills the initial part of the buffer which is a multiple of 8 bytes. Therefore, if size is divisible by 8, we were storing last word twice. This patch decrements byte count before dividing it by 8, making one less store in "size is divisible by 8" case, and not changing anything in all other cases. All at the cost of replacing one MOV insn with LEA insn. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2015-02-10	x86_64/memset: simple optimizations	Denys Vlasenko	-14/+16
	"and $0xff,%esi" is a six-byte insn (81 e6 ff 00 00 00), can use 4-byte "movzbl %sil,%esi" (40 0f b6 f6) instead. 64-bit imul is slow, move it as far up as possible so that the result (rax) has more time to be ready by the time we start using it in mem stores. There is no need to shuffle registers in preparation to "rep movs" if we are not going to take that code path. Thus, patch moves "jump if len < 16" instructions up, and changes alternate code path to use rdx and rdi instead of rcx and r8. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2014-11-23	fix tabs/spaces in memcpy.s	Rich Felker	-279/+279
	this file had been a mess that went unnoticed ever since it was imported. some lines used spaces for indention while others used tabs, and tabs were used for alignment.
2014-11-23	fix build regression in arm asm for memcpy	Rich Felker	-30/+30
	commit 27828f7e9adb6b4f93ca56f6f98ef4c44bb5ed4e fixed compatibility with clang's internal assembler, but broke compatibility with gas and the traditional arm asm syntax by switching to the arm "unified assembler language" (UAL). recent versions of gas also support UAL, but require the .syntax directive to be used to switch to it. clang on the other hand defaults to UAL. and old versions of gas (still relevant) don't support UAL at all. for the conditional ldm/stm instructions, "ia" is default and can just be omitted, resulting in a mnemonic that's compatible with both traditional and UAL syntax. but for byte/halfword loads and stores, there seems to be no mnemonic compatible with both, and thus .word is used to produce the desired opcode explicitly. the .inst directive is not used because it is not compatible with older assemblers.
2014-11-23	arm assembly changes for clang compatibility	Joakim Sindholt	-30/+30

2014-10-04	fix handling of odd lengths in swab function	Rich Felker	-1/+1
	this function is specified to leave the last byte with "unspecified disposition" when the length is odd, so for the most part correct programs should not be calling swab with odd lengths. however, doing so is permitted, and should not write past the end of the destination buffer.
2014-07-26	add support for LC_TIME and LC_MESSAGES translations	Rich Felker	-2/+3
	for LC_MESSAGES, translation of strerror and similar literal message functions is supported. for messages in other places (particularly the dynamic linker) that use format strings, translation is not yet supported. in order to make it possible and safe, such messages will need to be refactored to separate the textual content from the format. for LC_TIME, the day and month names and strftime-style format strings provided by nl_langinfo are supported for translation. however there may be limitations, as some of the original C-locale nl_langinfo strings are non-unique and thus perhaps non-suitable as keys. overall, the locale support activated by this commit should not be seen as complete and polished but as a basis for beginning to test locale functionality and implement locales.
2014-07-02	consolidate str[n]casecmp_l into str[n]casecmp source files	Rich Felker	-0/+16
	this is mainly done for consistency with the ctype functions and to declutter the src/locale directory.
2014-06-19	fix incorrect comparison loop condition in memmem	Rich Felker	-2/+2
	the logic for this loop was copied from null-terminated-string logic in strstr without properly adapting it to work with explicit lengths. presumably this error could result in false negatives (wrongly comparing past the end of the needle/haystack), false positives (stopping comparison early when the needle contains null bytes), and crashes (from runaway reads past the end of mapped memory).
2014-04-18	fix false negatives with periodic needles in strstr, wcsstr, and memmem	Rich Felker	-3/+3
	in cases where the memorized match range from the right factor exceeded the length of the left factor, it was wrongly treated as a mismatch rather than a match. issue reported by Yves Bastide.
2014-04-09	fix search past the end of haystack in memmem	Timo Teräs	-0/+1
	to optimize the search, memchr is used to find the first occurrence of the first character of the needle in the haystack before switching to a search for the full needle. however, the number of characters skipped by this first step were not subtracted from the haystack length, causing memmem to search past the end of the haystack.
2013-12-12	include cleanups: remove unused headers and add feature test macros	Szabolcs Nagy	-18/+7

2013-11-23	strcmp: Remove unnecessary check for *r	Michael Forney	-1/+1
	If l == r && l, then by transitivity, r.
2013-08-28	optimized C memcpy	Rich Felker	-16/+111
	unlike the old C memcpy, this version handles word-at-a-time reads and writes even for misaligned copies. it does not require that the cpu support misaligned accesses; instead, it performs bit shifts to realign the bytes for the destination. essentially, this is the C version of the ARM assembly language memcpy. the ideas are all the same, and it should perform well on any arch with a decent number of general-purpose registers that has a barrel shift operation. since the barrel shifter is an optional cpu feature on microblaze, it may be desirable to provide an alternate asm implementation on microblaze, but otherwise the C code provides a competitive implementation for "generic risc-y" cpu archs that should alleviate the urgent need for arch-specific memcpy asm.
2013-08-27	optimized C memset	Rich Felker	-12/+77
	this version of memset is optimized both for small and large values of n, and makes no misaligned writes, so it is usable (and near-optimal) on all archs. it is capable of filling up to 52 or 56 bytes without entering a loop and with at most 7 branches, all of which can be fully predicted if memset is called multiple times with the same size. it also uses the attribute extension to inform the compiler that it is violating the aliasing rules, unlike the previous code which simply assumed it was safe to violate the aliasing rules since translation unit boundaries hide the violations from the compiler. for non-GNUC compilers, 100% portable fallback code in the form of a naive loop is provided. I intend to eventually apply this approach to all of the string/memory functions which are doing word-at-a-time accesses.
2013-08-14	add arm-optimized memcpy implementation from bionic libc	Rich Felker	-0/+383
	the approach of this implementation was heavily investigated prior to adopting it. attempts to obtain similar performance with pure C code were capping out at about 75% of the performance of the asm, with considerably larger code size, and were fragile in that the compiler would sometimes compile part of memcpy into a call to itself. therefore, just using the asm seems to be the best option. this commit is the first to make use of the new subarch-specific asm framework. the new armel directory is the location for arm asm that should not be used for all arm subarchs, only the default one. armhf is the name of the little-endian hardfloat-ABI subarch, which can use the exact same asm. in both cases, the build system finds the asm by following a memcpy.sub file. the other two subarchs, armeb and armebhf, would need a big-endian variant of this code. it would not be hard to adapt the code to big endian, but I will hold off on doing so until there is demand for it.
2013-08-01	optimized memset asm for i386 and x86_64	Rich Felker	-0/+88
	the concept of both versions is the same; they differ only in details. for long runs, they use "rep movsl" or "rep movsq", and for small runs, they use a trick, writing from both ends towards the middle, that reduces the number of branches needed. in addition, if memset is called multiple times with the same length, all branches will be predicted; there are no loops. for larger runs, there are likely faster approaches than "rep", at least on some cpu models. for 32-bit, it's unlikely that there is any faster approach that does not require non-baseline instructions; doing anything fancier would require inspecting cpu capabilities. for 64-bit, there may very well be faster versions that work on all models; further optimization could be explored in the future. with these changes, memset is anywhere between 50% faster and 6 times faster, depending on the cpu model and the length and alignment of the destination buffer.
2013-07-09	fix a couple misleading/wrong signal descriptions in strsignal	Rich Felker	-2/+2
	there are still several more that are misleading, but SIGFPE (integer division error misdescribed as floating point) and and SIGCHLD (possibly non-exit status change events described as exiting) were the worst offenders.