- The build systems currently behave as follows when building the
"aocl_gemm" addon codebase (LPGEMM) with "amdzen" as the target
architecture (fat binary):
- Make: Attempts to compile LPGEMM kernels using the same
compiler flags that the makefile fragments set for BLIS
kernels, based on the compiler version.
- CMake: With presets, it always enables the addon compilation
unless explicitly overridden with the ENABLE_ADDON variable.
- This breaks the build with older compilers, because they cannot
compile the BF16 or INT8 intrinsics.
- This patch adds the functionality to check for GCC and Clang compiler versions,
and disables LPGEMM compilation if GCC < 11.2 or Clang < 12.0.
- Make: Updated the configure script to check the compiler version
if the addon is specified.
- CMake: Updated the main CMakeLists.txt to check the compiler version
if the addon is specified, and to force-update the associated
cache variable. Also updated kernels/CMakeLists.txt to
check whether "aocl_gemm" remains in the ENABLE_ADDONS list after
all the checks in the previous layers.
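The version gate described above can be sketched as follows. This is a minimal illustration, not the actual configure or CMake logic: the thresholds come from this commit message, while the function names and the handling of compilers other than GCC and Clang are assumptions made for the example.

```python
# Illustrative sketch only; the real checks live in the configure script
# and CMakeLists.txt. All names here are hypothetical.

# Minimum (major, minor) versions stated in the commit message.
MIN_VERSIONS = {"gcc": (11, 2), "clang": (12, 0)}


def parse_version(text):
    """Turn a string such as '11.2.0' into (11, 2), padding a missing minor with 0."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (2 - len(parts))
    return tuple(parts[:2])


def lpgemm_supported(compiler, version):
    """Return True if the compiler is new enough to build the aocl_gemm addon."""
    minimum = MIN_VERSIONS.get(compiler)
    if minimum is None:
        # Assumption: compilers not named in the commit message are left alone.
        return True
    return parse_version(version) >= minimum


def filter_addons(addons, compiler, version):
    """Drop 'aocl_gemm' from the addon list when the compiler is too old."""
    if lpgemm_supported(compiler, version):
        return list(addons)
    return [a for a in addons if a != "aocl_gemm"]
```

For example, `filter_addons(["aocl_gemm"], "gcc", "10.3.0")` yields an empty list, so a later layer that checks whether "aocl_gemm" is still present would skip the LPGEMM kernels.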
AMD-Internal: [CPUPL-7850]
Signed-off-by: Vignesh Balasubramanian <Vignesh.Balasubramanian@amd.com>
* commit 'db3134ed6d239a7962a2b7470d8c46611b9d17ef':
Disabled no post-ops path in lpgemm f32 kernels for few gcc versions
DTL Log update
Add external PR integration process and flowchart to CONTRIBUTING.md
Enabled disable-sba-pools feature in AOCL-BLAS (#101)
Fix for F32 to BF16 Conversion and AVX512 ISA Support Checks
Fixed Integer Overflow Issue in TPSV
Add AI Code Review workflow (#211)
Add AI Code Review Self-enablement file (#209)
Re-tuned GEMV thresholds (#210)
Adding bli_print_msg before bli_abort() for bli_thrinfo_sup_create_for_cntl
Add missing license text
Modified AVPY kernel to ensure consistency of numerical results (#188)
Fix memory leak in DGEMV kernel (#187)
Tuned DGEMV no-transpose thresholds (#193)
Set Security flags default enable (#194)
Standardize Zen kernel names (2)
Compiler warnings fixes (2)
coverity issue fix for ztrsm (#176)
Fixes Coverity static analysis issue in the DTRSM (#181)
Add files via upload (#197)
Initialize mem_t structures safely and handle NULL communicator in threading
Tidying code
Compiler warnings fixes
Fixing the coverity issues with CID: 23269 and CID: 137049 (#180)
Fixed high priority coverity issues in LPGEMM. (#178)
GCC 15 SUP kernel workaround (2)
Disable small_gemm for zen4/5 and added single thread check for tiny path (#167)
Optimal rerouting of GEMV inputs to avoid packing
Updated Guards in s8s8s32of32_sym_quant Framework
Fixed out-of-bound access in F32 matrix add/mul ops (#168)
Some files have copyright statements but not details of the license.
Add this to DTL source code and some build and benchmark related
scripts.
AMD-Internal: [CPUPL-6579]
Instead of editing a header file, add options to the build systems to
allow DTL tracing and/or logging output to be generated. For most users
logging is recommended; it produces one line of output per application
thread for every BLAS call made. Tracing provides more detailed
information about internal BLIS calls, and is aimed at expert users and
BLIS developers. Tracing levels from 1 to 10 control the granularity of
the information produced. The default level is 5. Note that tracing,
especially at higher tracing levels, will impose a significant runtime
overhead.
Example usage:
Using configure:
./configure ... --enable-aocl-dtl=log amdzen
./configure ... --enable-aocl-dtl=trace --aocl-dtl-trace-level=6 amdzen
./configure ... --enable-aocl-dtl=all amdzen
Using CMake:
cmake ... -DENABLE_AOCL_DTL=LOG
cmake ... -DENABLE_AOCL_DTL=TRACE -DAOCL_DTL_TRACE_LEVEL=6
cmake ... -DENABLE_AOCL_DTL=ALL
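The option semantics described above can be summarised in a short sketch. It is an illustration only: the mode names, the 1 to 10 range and the default level of 5 come from this commit message, while the function name, the returned dictionary and the treatment of an unset option as "everything off" are assumptions.

```python
# Illustrative sketch only; not the configure or CMake implementation.
# The function name and return shape are hypothetical.

VALID_MODES = {"log", "trace", "all"}


def dtl_settings(mode=None, trace_level=5):
    """Map a DTL mode (log, trace or all) and a trace level to on/off flags."""
    if not 1 <= trace_level <= 10:
        raise ValueError("trace level must be between 1 and 10")
    if mode is None:
        # Assumption: with no option given, neither logging nor tracing is on.
        return {"log": False, "trace": False, "trace_level": trace_level}
    # configure uses lowercase values and CMake uppercase, so normalise.
    mode = mode.lower()
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of: log, trace, all")
    return {
        "log": mode in ("log", "all"),
        "trace": mode in ("trace", "all"),
        "trace_level": trace_level,
    }
```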
Also, modify the function AOCL_get_requested_threads_count to correct
the reported thread count in cases where the internal value is recorded
as -1.
AMD-Internal: [CPUPL-7010]
Add macros to allow specific code options to be enabled or disabled,
controlled by options to configure and cmake. This expands on the
existing GEMM and/or TRSM functionality to enable/disable SUP handling
and replaces the hard-coded #define in include files to enable small
matrix paths.
All options are enabled by default for all BLIS sub-configs, but many of
them are currently only implemented in AMD-specific framework code
variants.
AMD-Internal: [CPUPL-6906]
---------
Co-authored-by: Varaganti, Kiran <Kiran.Varaganti@amd.com>
Rename generated aocl-blas.pc and aocl-blas-mt.pc to blis.pc and blis-mt.pc.
AMD-Internal: [SWLCSG-3446]
Change-Id: Ica784c7a0fd1e52b4d419795659947316e932ef6
Added separate package configuration files for the
st and mt libraries in the BLIS Makefile and CMakeLists.txt.
Change-Id: I8d851fac10d63983358e1f4c67fd9451246056bf
- Standardize formatting (spacing etc).
- Add full copyright to cmake files (excluding .json)
- Correct copyright and disclaimer text for frame and
zen, skx and a couple of other kernels to cover all
contributors, as is commonly used in other files.
- Fixed some typos and missing lines in copyright
statements.
AMD-Internal: [CPUPL-4415]
Change-Id: Ib248bb6033c4d0b408773cf0e2a2cda6c2a74371
- Remove execute file permission from source and make files.
- dos2unix conversion.
- Add missing eol at end of files.
Also update .gitignore to not exclude build directory but to
exclude any build_* created by cmake builds.
AMD-Internal: [CPUPL-4415]
Change-Id: I5403290d49fe212659a8015d5e94281fe41eb124
Correction to commit 2450a1813b
to add -DBLIS_CONFIG_FAMILY=zen5 support in cmake.
AMD-Internal: [CPUPL-3518]
Change-Id: Iecff2b64d5d95960cecbbf98d5269133747b122e
- Updating gemm/cgemm_ukernel.cpp to cast integers so that gtestsuite works for ILP64.
- Updating BLIS cmake presets to be conditional on Windows and Linux.
- Updating GTestSuite cmake system to use environment variable to set BLIS_PATH and reference library.
- Add more cmake presets options in gtestsuite.
- Added BUILD_STATIC_LIBS option which is on by default, only on Linux.
- Added TEST_WITH_SHARED option which is off by default, only on Linux.
- If only shared or static lib is being built, that's the one that will be used for testing.
- If both are being built, TEST_WITH_SHARED determines which library will be used for testing.
- Set linux workflows so that they build both static and shared libs, and use linux-static and linux-shared to denote which one should be used for testing.
- Set -fPIC for both static and shared builds to fix issues faced when building blis using AOCC 4.0.0 and gtestsuite using gcc 9.4.0.
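The library selection rule in the list above can be written out as a small sketch. This is only an illustration of the stated rule, not the gtestsuite CMake code; the function name and its string return values are hypothetical, and the parameters mirror the build types and the TEST_WITH_SHARED option.

```python
# Illustrative sketch only; the real logic lives in the gtestsuite CMake files.
# The function name and return values are hypothetical.


def library_for_testing(build_static, build_shared, test_with_shared=False):
    """Pick which library the tests link against, following the rules above."""
    if not (build_static or build_shared):
        raise ValueError("at least one library type must be built")
    if build_static and build_shared:
        # Both are available, so TEST_WITH_SHARED (off by default) decides.
        return "shared" if test_with_shared else "static"
    # Only one type is built, so that is the one used for testing.
    return "shared" if build_shared else "static"
```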
AMD-Internal: [CPUPL-2748]
Change-Id: I4227bab97ff31ecddfe218e18499f33b4e4ee63e
CMakeLists.txt is updated to support ASAN for finding
memory-related errors in the BLIS library. ASAN is enabled
by configuring cmake with the following option:
$ cmake .. -DENABLE_ASAN=ON
ASAN is supported only on Linux with the Clang compiler.
The default redzone size is 16 bytes and the maximum
redzone size is 2048 bytes:
$ ASAN_OPTIONS=redzone=2048 <exe>
AMD-Internal: [CPUPL-2748]
Change-Id: I0b70af5c41cf5c68602150daeb67d7432bbe5cb8
CMakeLists.txt is updated to generate a code coverage
report in HTML format by configuring cmake with
-DENABLE_COVERAGE=ON. Coverage is supported only on Linux
with the GCC compiler.
$ cmake .. -DENABLE_COVERAGE=ON
AMD-Internal: [CPUPL-2748]
Change-Id: I9b36b6cc3f1f97b53e1c4ee62948a017418e3d41
CMakeLists.txt is updated in blis/testsuite to make it work for
the static single-threaded version of BLIS.
AMD-Internal: [CPUPL-2748]
Change-Id: I004e19d4ddbf9cb94d6d23699893a2f684a3fb35
Some text files were missing a newline at the end of the file.
One has been added.
AMD-Internal: [CPUPL-3519]
Change-Id: I4b00876b1230b036723d6b56755c6ca844a7ffce
User control over the code path using AOCL_ENABLE_INSTRUCTIONS
or BLIS_ARCH_TYPE only makes sense for fat binary builds.
Thus this functionality is now disabled by default for
single architecture builds. Users can still override the default
selection by using the configure options --enable-blis-arch-type
or --disable-blis-arch-type.
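The default described above amounts to the following rule, shown here as a minimal sketch. The function name and parameters are hypothetical; `user_choice` stands in for an explicit --enable-blis-arch-type or --disable-blis-arch-type option, and `None` means neither was given.

```python
# Illustrative sketch only; not the configure or CMake implementation.
# Names are hypothetical.


def blis_arch_type_enabled(is_fat_binary, user_choice=None):
    """Decide whether runtime code path selection is compiled in."""
    if user_choice is not None:
        # An explicit enable/disable option always wins over the default.
        return user_choice
    # Default: on for fat binary builds, off for single architecture builds.
    return is_fat_binary
```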
Other changes:
- Include the x86_64 family as using zen codepaths in the cmake build system.
- Update help and error messages to include AOCL_ENABLE_INSTRUCTIONS.
AMD-Internal: [CPUPL-4202]
Change-Id: I7aa5fcf89df8675bcc12d81f81781de647e0fcf8