Details:
- Previously, the block_ptrs field of the pool_t was allowed to be
initialized as any unsigned integer, including 0. However, a length of
0 could be problematic given that malloc(0) is undefined and therefore
variable across implementations. As a safety measure, we check for
block_ptrs array lengths of 0 and, in that case, increase them to 1.
- Co-authored-by: Minh Quan Ho <minh-quan.ho@kalray.eu>
Details:
- The current mechanism for growing a pool_t doubles the length of the
block_ptrs array every time the array length needs to be increased
due to new blocks being added. However, that logic did not take in
account the new total number of blocks, and the fact that the caller
may be requesting more blocks that would fit even after doubling the
current length of block_ptrs. The code comments now contain two
illustrating examples that show why, even after doubling, we must
always have at least enough room to fit all of the old blocks plus
the newly requested blocks.
- This commit also happens to fix a memory corruption issue that stems
from growing any pool_t that is initialized with a block_ptrs length
of 0. (Previously, the memory pool for packed buffers of C was
initialized with a block_ptrs length of 0, but because it is unused
this bug did not manifest by default.)
- Co-authored-by: Minh Quan Ho <minh-quan.ho@kalray.eu>
This test will run on Linux, but all the kernels should run just fine. This does not test autodetection but then none of the other ARM tests do either.
Previously, this was a global variable. Setting the value was synchronized via a mutex but reading the value was not. Of course, these accesses are almost certainly atomic, but there is still the possibility of one thread attempting to set the value and then reading the value set by another thread. For correct operation under user threading (e.g. pthreads), this should probably be thread-local with no mutex.
Adds `--enable-rpath/--disable--rpath` (default disabled) to use an install_name starting with @rpath/. Otherwise, set the install_name to the absolute path of the install library, which was the previous behavior.
Details:
- Reworked support for ARM hardware detection in bli_cpuid.c to parse
the result of a CPUID-like instruction.
- Added a64fx support to bli_gks.c.
- #include arm64 and arm32 family headers from bli_arch_config.h.
- Fix the ordering of the "armsve" and "a64fx" strings in the
config_name string array in bli_arch.c. The ordering did not match
the ordering of the corresponding arch_t values in bli_type_defs.h,
as it should have all along.
- Added clang support to make_defs.mk in arm64, cortexa53, cortexa57
subconfigs.
- Updated arm64 and arm32 families in config_registry.
- Updated docs/HardwareSupport.md to reflect added ARM support.
- Thanks to Dave Love, RuQing Xu, and Devin Matthews for their
contributions in this PR (#344).
- RPATH entries (and DYLD_LIBRARY_PATH) do nothing on macOS unless the install_name of the library starts with @rpath/. While the install_name can be set to the absolute install path, this makes the installation non-relocatable. When using @path in the install_name, install paths within the normal DYLD_LIBRARY_PATH work with no changes on the user side, but for install paths off the beaten track, users must specify an RPATH entry when linking (or modify DYLD_LIBRARY_PATH at runtime). Perhaps this could be made into a configure-time option.
- Having relocable testsuite binaries is not necessarily a priority but it is easy to do with @executable_path (macOS) or $ORIGIN (linux/BSD).