Files
blis/frame/thread
mkadavil 457c33a601 Eliminating barriers in SUP path when matrices are not packed.
-Current gemm SUP path uses bli_thrinfo_sup_grow, bli_thread_range_sub
to generate per thread data ranges at each loop of gemm algorithm.
bli_thrinfo_sup_grow involves usage of multiple barriers for cross
thread synchronization. These barriers are necessary in cases where
either the A or B matrix are packed for centralized pack buffer
allocation/deallocation (bli_thread_am_ochief thread).

-However for cases where both A and B matrices are unpacked, these
barrier are resulting in overhead for smaller dimensions. Here creation
of unnecessary communicators are avoided and subsequently the
requirement for barriers are eliminated when packing is disabled for
both the input matrices in SUP path.

Change-Id: Ic373dfd2d6b08b8f577dc98399a83bb08f794afa
2022-01-06 01:56:43 -05:00
..
2021-06-17 05:17:37 -04:00
2021-03-08 19:04:17 +05:30