- Added support for 2 new APIs:
1. sgemm_compute()
2. dgemm_compute()
These are dependent on the ?gemm_pack_get_size() and ?gemm_pack()
APIs.
- ?gemm_compute() takes the packed matrix buffer (represented by the
packed matrix identifier) and performs the GEMM operation:
C := A * B + beta * C.
- Whenever the kernel storage preference and the matrix storage
scheme isn't matching, and the respective matrix being loaded isn't
packed either, on-the-go packing has been enabled for such cases to
pack that matrix.
- Note: If both the matrices are packed using the ?gemm_pack() API,
it is the responsibility of the user to pack only one matrix with
alpha scalar and the other with a unit scalar.
- Note: Support is presently limited to Single Thread only. Both, pack
and compute APIs are forced to take n_threads=1.
AMD-Internal: [CPUPL-3560]
Change-Id: I825d98a0a5038d31668d2a4b84b3ccc204e6c158
Some text files were missing a newline at the end of the file.
One has been added.
Also correct file format of windows/tests/inputs.yaml, which
was missed in commit 0f0277e104
AMD-Internal: [CPUPL-2870]
Change-Id: Icb83a4a27033dc0ff325cb84a1cf399e953ec549
- For the cases where AVX2 is available, an optimized function is called,
based on Blue's algorithm. The fallback method based on sumsqv is used
otherwise.
- Scaling is used to avoid overflow and underflow.
- Works correctly for negative increments.
AMD-Internal: [CPUPL-2551]
Change-Id: I5d8976b29b5af463a8981061b2be907ea647123c
1. Added the checks in .c files of the bench folder to read the input parameters from the given input files on windows using fscanf.
Change-Id: Ie0497696304d318f345a646ab0ce3ba84debd4e2
Details:
- Intrinsic implementation of axpbyv for AVX2
- Bench written for axpbyv
- Added definitions in zen contexts
AMD-Internal: [CPUPL-1963]
Change-Id: I9bc21a6170f5c944eb6e9e9f0e994b9992f8b539
- Added bench utility for dotv and scalv API's
- Corrected logging for scalv to handle complex types
- Corrected logging to remove transpose field from dotv logs
AOCL-Internal: [CPUPL-1577]
Change-Id: Ieb29e773309de1520c7fa5b79b97c943d894ba07