mirror of
https://github.com/amd/blis.git
synced 2026-05-24 18:34:40 +00:00
219 lines
8.4 KiB
Makefile
#
#
# BLIS
# An object-based framework for developing high-performance BLAS-like
# libraries.
#
# Copyright (C) 2014, The University of Texas at Austin
# Copyright (C) 2022, Advanced Micro Devices, Inc. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
# - Redistributions of source code must retain the above copyright
#   notice, this list of conditions and the following disclaimer.
# - Redistributions in binary form must reproduce the above copyright
#   notice, this list of conditions and the following disclaimer in the
#   documentation and/or other materials provided with the distribution.
# - Neither the name(s) of the copyright holder(s) nor the names of its
#   contributors may be used to endorse or promote products derived
#   from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#

# Only include this block of code once
ifndef CONFIG_MK_INCLUDED
CONFIG_MK_INCLUDED := yes

# The version string. This could be the official string or a custom
# string forced at configure-time.
VERSION := @version@

# The shared library .so major and minor.build version numbers.
SO_MAJOR  := @so_version_major@
SO_MINORB := @so_version_minorbuild@
SO_MMB    := $(SO_MAJOR).$(SO_MINORB)

# The name of the configuration family.
CONFIG_NAME := @config_name@

# The list of sub-configurations associated with CONFIG_NAME. Each
# sub-configuration in CONFIG_LIST corresponds to a configuration
# sub-directory in the 'config' directory. See the 'config_registry'
# file for the full list of registered configurations.
CONFIG_LIST := @config_list@

# The list of kernels needed for the configurations in CONFIG_LIST.
# Each item in this list corresponds to a sub-directory in the top-level
# 'kernels' directory. Oftentimes, this list is identical to CONFIG_LIST,
# but not always. For example, if configurations X and Y use the same
# kernel set X, configuration Z uses kernel set Z, and configuration W
# uses kernel set Q, then a CONFIG_LIST of "X Y Z W" would yield a
# KERNEL_LIST of "X Z Q".
KERNEL_LIST := @kernel_list@

# This list contains some number of "kernel:config" pairs, where "config"
# specifies which configuration's compilation flags (CFLAGS) should be
# used to compile the source code for the kernel set named "kernel".
KCONFIG_MAP := @kconfig_map@
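
# A minimal sketch (commented out, not part of the build) of how a single
# "kernel:config" pair could be split with GNU make text functions. The pair
# "zen:haswell" and the kc_* variable names are hypothetical illustrations,
# not values or names used elsewhere in the build system.
#kc_pair   := zen:haswell
#kc_kernel := $(firstword $(subst :, ,$(kc_pair)))
#kc_config := $(lastword $(subst :, ,$(kc_pair)))
#$(info kernel set = $(kc_kernel); flags taken from = $(kc_config) )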

# The operating system name, which should be either 'Linux' or 'Darwin'.
OS_NAME := @os_name@

# Whether the operating system is Windows.
IS_WIN := @is_win@

# The directory path to the top level of the source distribution. When
# building in-tree, this path is ".". When building out-of-tree, this path
# is the path used to identify the location of configure. We also allow the
# includer of config.mk to override this value by setting DIST_PATH prior
# to including this file. This override option is employed, for example,
# when common.mk (and therefore config.mk) is included by the Makefile
# local to the 'testsuite' directory, or the 'test' directory containing
# individual test drivers.
ifeq ($(strip $(DIST_PATH)),)
DIST_PATH := @dist_path@
endif
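
# A sketch of the override described above, as it might appear in a Makefile
# one directory below the top level. The relative path ".." and the location
# of common.mk are assumptions made for illustration (shown commented out).
#DIST_PATH := ..
#include $(DIST_PATH)/common.mk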

# The C compiler.
CC_VENDOR := @CC_VENDOR@
CC        := @CC@

# Important C compiler ranges.
GCC_OT_4_9_0  := @gcc_older_than_4_9_0@
GCC_OT_6_1_0  := @gcc_older_than_6_1_0@
GCC_OT_9_1_0  := @gcc_older_than_9_1_0@
GCC_OT_11_2_0 := @gcc_older_than_11_2_0@

# The C++ compiler. NOTE: A C++ compiler is typically not needed.
CXX := @CXX@

# Static library indexer.
RANLIB := @RANLIB@

# Archiver.
AR := @AR@

# Python interpreter.
PYTHON := @PYTHON@

# Preset (required) CFLAGS and LDFLAGS. These variables capture the value
# of the CFLAGS and LDFLAGS environment variables at configure-time (and/or
# the value of CFLAGS/LDFLAGS if either was specified on the command line).
# These flags are used in addition to the flags automatically determined
# by the build system.
CFLAGS_PRESET  := @cflags_preset@
LDFLAGS_PRESET := @ldflags_preset@

# The level of debugging info to generate.
DEBUG_TYPE := @debug_type@

# Whether operating system support was requested via --enable-system.
ENABLE_SYSTEM := @enable_system@

# The requested threading model.
THREADING_MODEL := @threading_model@

# Whether the compiler supports "#pragma omp simd" via the -fopenmp-simd option.
PRAGMA_OMP_SIMD := @pragma_omp_simd@

# The installation prefix, exec_prefix, libdir, includedir, and sharedir
# values from configure tell us where to install the libraries, header files,
# and public makefile fragments. We must first assign each substituted
# @anchor@ to its own variable. Why? Because the substitutions may contain
# unevaluated variable expressions. For example, '@libdir@' may be replaced
# with '${exec_prefix}/lib'. By assigning the anchors to variables first, and
# then assigning them to their final INSTALL_* variables, we allow prefix and
# exec_prefix to be used in the definitions of exec_prefix, libdir,
# includedir, and sharedir.
prefix      := @prefix@
exec_prefix := @exec_prefix@
libdir      := @libdir@
includedir  := @includedir@
sharedir    := @sharedir@
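
# For illustration only (hypothetical values, commented out): if configure
# had substituted the anchors as shown below, each assignment would expand
# using the variable defined just before it, so the final $(info) line would
# report libdir as /usr/local/lib.
#prefix      := /usr/local
#exec_prefix := ${prefix}
#libdir      := ${exec_prefix}/lib
#$(info libdir = $(libdir) )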

# Notice that we support the use of DESTDIR so that advanced users may install
# to a temporary location.
INSTALL_LIBDIR   := $(DESTDIR)$(libdir)
INSTALL_INCDIR   := $(DESTDIR)$(includedir)
INSTALL_SHAREDIR := $(DESTDIR)$(sharedir)
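
# For example, assuming libdir is /usr/local/lib and a hypothetical staging
# directory /tmp/stage, invoking
#   make install DESTDIR=/tmp/stage
# would set INSTALL_LIBDIR to /tmp/stage/usr/local/lib. When DESTDIR is
# unset, INSTALL_LIBDIR is simply the value of libdir.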

#$(info prefix = $(prefix) )
#$(info exec_prefix = $(exec_prefix) )
#$(info libdir = $(libdir) )
#$(info includedir = $(includedir) )
#$(info sharedir = $(sharedir) )
#$(error .)

# Whether to output verbose command-line feedback as the Makefile is
# processed.
ENABLE_VERBOSE := @enable_verbose@

# Whether we are building out-of-tree.
BUILDING_OOT := @configured_oot@

# Whether we need to employ an alternate method for passing object files to
# ar and/or the linker to work around a small value of ARG_MAX.
ARG_MAX_HACK := @enable_arg_max_hack@

# Whether to build the static and shared libraries.
# NOTE: The "MK_" prefix helps differentiate these variables from their
# corresponding cpp macros, which use the BLIS_ prefix.
MK_ENABLE_STATIC := @enable_static@
MK_ENABLE_SHARED := @enable_shared@

# Whether to use an install_name based on @rpath.
MK_ENABLE_RPATH := @enable_rpath@

# Whether to export all symbols within the shared library, even those symbols
# that are considered to be for internal use only.
EXPORT_SHARED := @export_shared@

# Whether to enable either the BLAS or CBLAS compatibility layers.
MK_ENABLE_BLAS  := @enable_blas@
MK_ENABLE_CBLAS := @enable_cblas@

# Whether libblis will depend on libmemkind for certain memory allocations.
MK_ENABLE_MEMKIND := @enable_memkind@

# The names of the addons to include when building BLIS. If empty, no addons
# will be included.
ADDON_LIST := @addon_list@

# The name of a sandbox defining an alternative gemm implementation. If empty,
# no sandbox will be used and the conventional gemm implementation will remain
# enabled.
SANDBOX := @sandbox@

# The name of the pthreads library. If --disable-system was given, then this
# variable is set to the empty value.
LIBPTHREAD := @libpthread@

# Whether trsm pre-inversion is enabled.
MK_ENABLE_TRSM_PREINVERSION := @enable_trsm_preinversion@

# Complex return scheme configuration.
MK_COMPLEX_RETURN_SCHEME := @complex_return@

# Status of the AOCL dynamic configuration.
MK_ENABLE_AOCL_DYNAMIC := @enable_aocl_dynamic@

# BLAS integer type size.
MK_BLAS_INT_TYPE_SIZE := @blas_int_type_size@

MK_IS_ARCH_ZEN := @enable_aocl_zen@

# end of ifndef CONFIG_MK_INCLUDED conditional block
endif