- Implemented a feature to benchmark the ?ASUMV APIs
for the supported datatypes. The feature can
benchmark the BLAS, CBLAS or native BLIS API, depending
on which macro is defined.
- Added a sample input file with examples for benchmarking
ASUMV with every supported datatype.
AMD-Internal: [CPUPL-5984]
Change-Id: Iff512166545687d12504babda1bd52d71a3a5755
- Bug : When the library is configured with a 32-bit
native BLIS integer size, the bench application
would crash or read an invalid value when parsing
the input file. This was caused by a format specifier
mismatch, because the specifier was hard-coded in the
Makefile.
- Fix : Defined a header that sets the format specifiers
as macros that match how the library is configured
and built. Every benchmarking source file is expected
to include this header.
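
The idea behind the fix can be sketched as below. This is only an
illustration, not the header added by this change: BENCH_INT_SIZE,
BENCH_INT_SCN, bench_int_t and parse_dims() are hypothetical names,
with BENCH_INT_SIZE standing in for the integer size chosen when the
library is configured. The SCNd32/SCNd64 macros come from the
standard <inttypes.h>.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-in for the integer size chosen at configure time. */
#ifndef BENCH_INT_SIZE
#define BENCH_INT_SIZE 64
#endif

/* Pick the integer type and its scanf specifier together, so they
   can never disagree. */
#if BENCH_INT_SIZE == 32
typedef int32_t bench_int_t;
#define BENCH_INT_SCN SCNd32
#else
typedef int64_t bench_int_t;
#define BENCH_INT_SCN SCNd64
#endif

/* Parse "<m> <n>" from one input line.
   Returns 1 if both values were read, 0 otherwise. */
static int parse_dims(const char *line, bench_int_t *m, bench_int_t *n)
{
    return sscanf(line, "%" BENCH_INT_SCN " %" BENCH_INT_SCN, m, n) == 2;
}
```

Because the type and the specifier are selected by the same
condition, a source file that includes such a header gets a matching
pair regardless of the configured integer size.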
AMD-Internal: [CPUPL-5895]
Change-Id: I9718c36a1a9fe3eba4d5da419823c16097902d89
- Standardize formatting (spacing, etc.).
- Add full copyright to cmake files (excluding .json)
- Correct copyright and disclaimer text for frame and
zen, skx and a couple of other kernels to cover all
contributors, as is commonly used in other files.
- Fixed some typos and missing lines in copyright
statements.
AMD-Internal: [CPUPL-4415]
Change-Id: Ib248bb6033c4d0b408773cf0e2a2cda6c2a74371
- Remove execute file permission from source and make files.
- dos2unix conversion.
- Add missing EOL at end of files.
Also update .gitignore so that it no longer excludes the build
directory but does exclude any build_* created by CMake builds.
AMD-Internal: [CPUPL-4415]
Change-Id: I5403290d49fe212659a8015d5e94281fe41eb124
- Implemented a feature to benchmark the ?AXPYV APIs
for the supported datatypes. The feature can
benchmark the BLAS, CBLAS or native BLIS API, depending
on which macro is defined.
- Added a sample input file with examples for benchmarking
AXPYV with every supported datatype.
- Updated the sample input file for SCALV with examples
for benchmarking every supported datatype.
AMD-Internal: [CPUPL-4805]
Change-Id: I550920e3a57fcc2e4900e9e698330d8b8595bdee
- Added support for 2 new APIs:
1. sgemm_compute()
2. dgemm_compute()
These are dependent on the ?gemm_pack_get_size() and ?gemm_pack()
APIs.
- ?gemm_compute() takes the packed matrix buffer (represented by the
packed matrix identifier) and performs the GEMM operation:
C := A * B + beta * C.
- Whenever the kernel storage preference and the matrix storage
scheme do not match, and the matrix being loaded is not already
packed, that matrix is packed on the fly.
- Note: If both matrices are packed using the ?gemm_pack() API,
the user is responsible for packing only one matrix with the
alpha scalar and the other with a unit scalar.
- Note: Support is presently limited to single-threaded execution. Both
the pack and compute APIs are forced to use n_threads=1.
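
The alpha rule in the first note can be illustrated with a small
reference model. This is a sketch only: ref_pack() and ref_compute()
are hypothetical helpers that mimic the described semantics on
row-major data, not the ?gemm_pack() / ?gemm_compute() APIs or their
signatures.

```c
#include <stddef.h>

/* Hypothetical reference "pack": copy an m x n row-major matrix with
   leading dimension ld into a contiguous buffer, scaling by `scalar`. */
static void ref_pack(size_t m, size_t n, double scalar,
                     const double *src, size_t ld, double *dst)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            dst[i * n + j] = scalar * src[i * ld + j];
}

/* Hypothetical reference "compute": C := A * B + beta * C, where A
   (m x k) and B (k x n) are contiguous packed buffers. There is no
   alpha argument: any scaling must already be in a packed operand. */
static void ref_compute(size_t m, size_t n, size_t k,
                        const double *a, const double *b,
                        double beta, double *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * ldc + j] = acc + beta * c[i * ldc + j];
        }
    }
}
```

In this model, packing A with alpha and B with 1.0 yields
alpha * A * B + beta * C. Packing both with alpha would scale the
product by alpha squared, which is why only one matrix may carry it.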
AMD-Internal: [CPUPL-3560]
Change-Id: I825d98a0a5038d31668d2a4b84b3ccc204e6c158
Some text files were missing a newline at the end of the file.
A newline has been added to each.
Also correct the file format of windows/tests/inputs.yaml, which
was missed in commit 0f0277e104
AMD-Internal: [CPUPL-2870]
Change-Id: Icb83a4a27033dc0ff325cb84a1cf399e953ec549
- When AVX2 is available, an optimized function based on Blue's
algorithm is called. Otherwise, the fallback method based on sumsqv
is used.
- Scaling is used to avoid overflow and underflow.
- Works correctly for negative increments.
AMD-Internal: [CPUPL-2551]
Change-Id: I5d8976b29b5af463a8981061b2be907ea647123c
1. Added checks in the .c files of the bench folder for reading the input parameters from the given input files on Windows using fscanf.
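
One common form of such a check is to compare the fscanf return value
against the number of expected fields. The sketch below assumes that
form, since the change itself is not detailed here. The record layout,
bench_params and read_params() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical input record: "<m> <n> <alpha>" on each line. */
typedef struct {
    long m;
    long n;
    double alpha;
} bench_params;

/* Read one record from fin.
   Returns 1 if all three fields were converted, 0 on end of file
   or on a malformed line. */
static int read_params(FILE *fin, bench_params *p)
{
    int got = fscanf(fin, "%ld %ld %lf", &p->m, &p->n, &p->alpha);
    return got == 3;
}
```

A bench loop written this way stops at the first incomplete or
malformed record instead of running with uninitialized parameters.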
Change-Id: Ie0497696304d318f345a646ab0ce3ba84debd4e2
Details:
- Intrinsic implementation of axpbyv for AVX2
- Bench written for axpbyv
- Added definitions in zen contexts
AMD-Internal: [CPUPL-1963]
Change-Id: I9bc21a6170f5c944eb6e9e9f0e994b9992f8b539
- Added bench utility for dotv and scalv APIs
- Corrected logging for scalv to handle complex types
- Corrected logging to remove transpose field from dotv logs
AOCL-Internal: [CPUPL-1577]
Change-Id: Ieb29e773309de1520c7fa5b79b97c943d894ba07