mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
1. Corrected B buffer access to use the buffer's offset instead of its starting address, which is required in the multi-threaded (MT) case.
3. When num_threads > 1, the B buffer is divided into blocks along the m or n dimension, depending on whether the side is right or left. B must therefore be accessed through its offset to reach the start of each block.
4. Currently the B matrix is divided into blocks for each thread, and the complete matrix A is used by all threads. In case the design changes in the future, A buffer access was also modified to use its offset, so that partitioning matrix A for MT is supported.

AMD-Internal: [CPUPL-2520]
Change-Id: Ic09e9e945417b86e2bc2e2d4548f65db308cd2ea
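
The difference between reading a block from the buffer's starting address and reading it from its offset can be illustrated with a minimal C sketch. This is not the BLIS code touched by this change: `mat_view_t`, `sketch_side_t`, `view_start`, `thread_range`, and `b_block_for_thread` are hypothetical names introduced only for illustration. The mapping of side to partitioned dimension (right side splits m, left side splits n) is an assumption read from the word order of the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical matrix view (NOT the BLIS obj_t): a base pointer plus the
 * row/column offset of the block this view refers to. */
typedef struct {
    double *buf;   /* starting address of the whole allocation */
    size_t  off_m; /* row offset of the block inside the matrix */
    size_t  off_n; /* column offset of the block inside the matrix */
    size_t  rs;    /* row stride, in elements */
    size_t  cs;    /* column stride, in elements */
} mat_view_t;

typedef enum { SKETCH_LEFT, SKETCH_RIGHT } sketch_side_t;

/* Address of the first element of the block. Using v->buf directly would
 * make every thread start at element (0, 0) of the full matrix. */
double *view_start(const mat_view_t *v)
{
    return v->buf + v->off_m * v->rs + v->off_n * v->cs;
}

/* Split [0, dim) into nt nearly equal contiguous ranges and return the
 * start and length of the range owned by thread tid. */
void thread_range(size_t dim, size_t nt, size_t tid,
                  size_t *start, size_t *len)
{
    assert(nt > 0 && tid < nt);
    size_t base = dim / nt;
    size_t rem  = dim % nt;
    *start = tid * base + (tid < rem ? tid : rem);
    *len   = base + (tid < rem ? 1 : 0);
}

/* Build the view of B owned by thread tid.
 * Assumption: right side partitions rows (m), left side partitions
 * columns (n). blk_m and blk_n receive the block dimensions. */
mat_view_t b_block_for_thread(double *b, size_t m, size_t n,
                              size_t rs, size_t cs,
                              sketch_side_t side,
                              size_t nt, size_t tid,
                              size_t *blk_m, size_t *blk_n)
{
    mat_view_t v = { b, 0, 0, rs, cs };
    size_t start, len;

    if (side == SKETCH_RIGHT) {
        thread_range(m, nt, tid, &start, &len);
        v.off_m = start;
        *blk_m  = len;
        *blk_n  = n;
    } else {
        thread_range(n, nt, tid, &start, &len);
        v.off_n = start;
        *blk_m  = m;
        *blk_n  = len;
    }
    return v;
}
```

With a column-major 4 x 6 matrix B (`rs = 1`, `cs = 4`) and two threads, thread 1 on the left side would start at `b + 12` (column 3), and on the right side at `b + 2` (row 2). The same `view_start` helper would apply unchanged to a view of A if A were partitioned in a later design, which is the case item 4 prepares for.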