mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
Details:
1. Fixed out-of-bounds memory accesses during load and store.
2. In the intrinsic code:
   i. AVX2 registers can hold 4 double elements.
   ii. In the remainder case, the number of elements is less than the width of the vector register. Although fewer than 4 elements were required, the code read and wrote in chunks of 4 elements due to vectorization, which could cause out-of-bounds memory access.
3. Redesigned the code to prevent out-of-bounds access by loading and storing exactly the number of elements required.

AMD-Internal: [SWLCSG-1470]
Change-Id: I786f8023cf5a5f3e5343bea413c59bd0e764df9b
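
Illustration (not the code from this change): a minimal, portable C sketch of the idea of touching only the elements that are actually required in the remainder case. The function name axpy_bounded, the AXPY-style operation, and the staging-buffer approach are assumptions chosen for illustration. A real AVX2 kernel would more likely use vector loads for the full chunks and masked or partial loads and stores for the tail.

```c
#include <stddef.h>
#include <string.h>

#define VEC_WIDTH 4 /* number of doubles in one 256-bit AVX2 register */

/* Hypothetical kernel: y[i] += alpha * x[i] for i in [0, n).
 * Never reads or writes x or y beyond index n - 1. */
void axpy_bounded(size_t n, double alpha, const double *x, double *y)
{
    size_t i = 0;

    /* Main loop: full chunks of VEC_WIDTH elements, all in bounds.
     * This stands in for a 4-wide vector load, multiply-add, and store. */
    for (; i + VEC_WIDTH <= n; i += VEC_WIDTH) {
        for (size_t j = 0; j < VEC_WIDTH; ++j)
            y[i + j] += alpha * x[i + j];
    }

    /* Remainder: between 1 and VEC_WIDTH - 1 elements are left. */
    size_t rem = n - i;
    if (rem > 0) {
        /* Zero-padded staging buffers that are safe to access 4-wide. */
        double xbuf[VEC_WIDTH] = {0.0};
        double ybuf[VEC_WIDTH] = {0.0};

        /* Load exactly rem elements from the caller's arrays. */
        memcpy(xbuf, x + i, rem * sizeof(double));
        memcpy(ybuf, y + i, rem * sizeof(double));

        /* The 4-wide computation runs on the local buffers only. */
        for (size_t j = 0; j < VEC_WIDTH; ++j)
            ybuf[j] += alpha * xbuf[j];

        /* Store exactly rem elements back. */
        memcpy(y + i, ybuf, rem * sizeof(double));
    }
}
```

With n = 7, the main loop handles elements 0 to 3 and the remainder path handles elements 4 to 6. Element 7 of either array is never accessed, which is the property item 3 describes.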