Files
blis/frame
Hari Govind 9c26de1a18 Optimisiation COPYV APIs
- Implemented AVX512 kernels for scopyv_, dcopyv_ and  zcopyv_
  using respective AVX512 intrinsics including masked
  load and store operations.

- Implemented AVX512 kernels for scopy_, dcopy_ and
  zcopy_ using assembly language to prevent loss of
  performance during the translation of intrinsics.

- Updated the dcopy_blis_impl( ... ) and
  zcopy_blis_impl( ... ) function to support
  multithreaded calls to the respective computational
  kernels, if and when the OpenMP support is enabled.

- Implemented OpenMP parallelization for dcopyv_ and
  zcopyv_ APIs, while scopyv_ and ccopyv_ only support
  single thread.

AMD-Internal: [CPUPL-4854]
Change-Id: I5fbd0bcca4e59001fbe2b1168b624d0c33242b3e
2024-05-01 00:23:01 +05:30
..
2023-11-23 08:54:31 -05:00
2024-04-12 07:26:31 -04:00
2024-05-01 00:23:01 +05:30
2024-05-01 00:23:01 +05:30
2023-11-23 08:54:31 -05:00
2023-11-23 08:54:31 -05:00