Shubham Sharma de92fb0680 Added Memory testing for DTRSM
- Added framework for memory testing.
- Out-of-bounds reads and writes can be
  detected in both C and assembly.
- Added memory tests for DTRSM.
- Test methodology:
  - Use Linux page protection to mark some memory
    before and after the required buffer as protected.
  - Set the first and last page_size bytes as
    read, write and execute protected (redzones).
  - If any part of the code tries to read or write
    in the redzones, a SIGSEGV signal is
    generated, which can be used to detect an
    out-of-bounds read or write.
  - Page protection can only be set per page.
    If the required buffer size is not a multiple
    of pagesize, we have to allocate more memory
    than required in order to make sure the start and
    end of the redzones align with page boundaries.
  - Override malloc(size) to allocate
    'buffer_size+(2*pagesize)' where buffer_size =
    minimum size such that buffer_size > 'size' and
    buffer_size is multiple of pagesize.
  - Use first and last page_size bytes of allocated
    buffer as redzones, use first 'size' of the middle
    buffer as first greenzone and last 'size' bytes as
    second greenzone.
  - Call the test code once with the first greenzone and then
    with the second greenzone. Greenzones are surrounded
    by redzones, so if the test code reads or writes before
    or after the greenzones, it will be detected.

   |_____________________________________________________|
   |  red_zone1 |  green_zone1    green_zone2 | red_zone2|
   |_____________________________________________________|
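
The methodology above can be sketched in C as follows. This is a minimal illustration, not the framework added by this commit: every name here (guarded_buf, guarded_alloc, faults, and the write_* helpers) is hypothetical, the sketch maps memory with mmap instead of overriding malloc, and it rounds the middle buffer up to the smallest page multiple that holds 'size' bytes. It assumes Linux, where touching a PROT_NONE page raises SIGSEGV.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout: | red_zone1 | middle buffer | red_zone2 | */
typedef struct {
    unsigned char *base;    /* start of the mapping (red_zone1)        */
    size_t         total;   /* total mapped bytes                      */
    unsigned char *green1;  /* first 'size' bytes of the middle buffer */
    unsigned char *green2;  /* last 'size' bytes of the middle buffer  */
} guarded_buf;

static sigjmp_buf fault_jmp;

/* SIGSEGV handler: jump back to the sigsetjmp point in faults(). */
static void on_segv(int sig) { (void)sig; siglongjmp(fault_jmp, 1); }

/* Map the buffer and protect its first and last page. Returns 0 on success. */
static int guarded_alloc(guarded_buf *g, size_t size)
{
    assert(size > 0);
    size_t page   = (size_t)sysconf(_SC_PAGESIZE);
    size_t middle = ((size + page - 1) / page) * page; /* round up to pages */
    size_t total  = middle + 2 * page;

    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    unsigned char *base = p;
    if (mprotect(base, page, PROT_NONE) != 0 ||
        mprotect(base + page + middle, page, PROT_NONE) != 0) {
        munmap(p, total);
        return -1;
    }
    g->base   = base;
    g->total  = total;
    g->green1 = base + page;                 /* directly after red_zone1  */
    g->green2 = base + page + middle - size; /* directly before red_zone2 */
    return 0;
}

static void guarded_free(guarded_buf *g) { munmap(g->base, g->total); }

/* Run fn(buf, size); return 1 if it raised SIGSEGV, 0 otherwise. */
static int faults(void (*fn)(unsigned char *, size_t),
                  unsigned char *buf, size_t size)
{
    struct sigaction sa, old;
    volatile int faulted = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    if (sigsetjmp(fault_jmp, 1) == 0)
        fn(buf, size);   /* returns normally if no redzone was touched */
    else
        faulted = 1;     /* reached via siglongjmp from the handler    */

    sigaction(SIGSEGV, &old, NULL);
    return faulted;
}

/* Stand-ins for the code under test. */
static void write_in_bounds(unsigned char *buf, size_t n) { memset(buf, 0xAB, n); }

static void write_one_past(unsigned char *buf, size_t n)
{
    volatile unsigned char *p = buf;
    p[n] = 1;            /* one byte past the end of the buffer */
}

static void write_one_before(unsigned char *buf, size_t n)
{
    (void)n;
    volatile unsigned char *p = buf;
    p[-1] = 1;           /* one byte before the start of the buffer */
}
```

Passing g.green1 to the code under test catches underruns (the byte before it lies in red_zone1), and passing g.green2 catches overruns (the byte after it lies in red_zone2), which is why the test code is run once per greenzone. For example, after guarded_alloc(&g, 100), faults(write_one_past, g.green2, 100) is expected to return 1, while faults(write_in_bounds, g.green2, 100) returns 0.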

AMD-Internal: [CPUPL-4403]
Change-Id: Ic5c22a9adf8f833c77510686eee886485e894354
2024-02-19 23:41:28 -05:00

AOCL-BLAS library

AOCL-BLAS is AMD's optimized version of BLAS targeted for AMD EPYC and Ryzen CPUs. It is developed as a fork of BLIS (https://github.com/flame/blis), which is developed by members of the Science of High-Performance Computing (SHPC) group in the Institute for Computational Engineering and Sciences at The University of Texas at Austin and other collaborators (including AMD). All known features and functionality of BLIS are retained and supported in the AOCL-BLAS library. AOCL-BLAS is regularly updated with improvements from the upstream repository.

AOCL-BLAS is optimized with the SSE2, AVX2, and AVX512 instruction sets, which are enabled based on the target Zen architecture through the dynamic dispatch feature. All prominent Level 3, Level 2, and Level 1 APIs are designed and optimized with specific code paths for different size ranges (e.g., small, medium, and large). These algorithms are designed and customized to exploit the architectural improvements of the target platform.

For detailed instructions on how to configure, build, install, and link against AOCL-BLAS on AMD CPUs, please refer to the AOCL User Guide on the AMD developer portal.

The upstream repository (https://github.com/flame/blis) contains further information on BLIS, including background information on BLIS design, usage examples, and a complete BLIS API reference.

AOCL-BLAS is developed and maintained by AMD. You can contact us by email at toolchainsupport@amd.com. You can also raise issues and suggestions on the GitHub repository at https://github.com/amd/blis/issues.

Description
BLAS-like Library Instantiation Software Framework
Readme BSD-3-Clause 72 MiB