From d6bb56d088c0e881ea733120c349d06fe124cd0d Mon Sep 17 00:00:00 2001
From: sraut
Date: Wed, 19 Dec 2018 21:21:10 +0530
Subject: [PATCH] Fixed BLAS test failures of small matrix SYRK for single and double precision.

Details:
- SYRK for small matrices was implemented by reusing the small GEMM
  routine. As a result, output was written to the full C matrix and,
  because C is symmetric, its lower and upper triangles contained the
  same results. The BLAS SYRK API spec requires that only the lower or
  the upper triangle of C be written. This caused BLAS test failures,
  even though the BLIS testsuite passed for the small SYRK operation.
- To fix the BLAS test failures of small matrix SYRK, separate kernel
  routines are implemented for small SYRK in both single and double
  precision. The newly added small SYRK routines are in the file
  kernels/zen/3/bli_syrk_small.c. Intermediate results for matrix C are
  now written to a scratch buffer. Final results are then copied from
  the scratch buffer into either the lower or the upper triangle of
  matrix C using SIMD copies.
- Source and header files frame/3/syrk/bli_syrk_front.c and
  frame/3/syrk/bli_syrk_front.h are changed to invoke the new small SYRK
  routines.

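The scratch-buffer approach described above can be sketched roughly as follows. This is a minimal scalar illustration, not the code in kernels/zen/3/bli_syrk_small.c: every name here (tri_t, full_product, store_triangle, syrk_small_sketch) is hypothetical, row-major storage is assumed for simplicity, and plain loops stand in for the SIMD copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the BLIS kernel. */
typedef enum { TRI_LOWER, TRI_UPPER } tri_t;

/* Step 1: write the full n x n product alpha * A * A^T into a row-major
   scratch buffer. A is n x k, row-major, with leading dimension lda. */
static void full_product(size_t n, size_t k, double alpha,
                         const double *a, size_t lda, double *scratch)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * lda + p] * a[j * lda + p];
            scratch[i * n + j] = alpha * acc;
        }
    }
}

/* Step 2: update only the requested triangle of C (diagonal included).
   The opposite triangle is neither read nor written. */
static void store_triangle(tri_t uplo, size_t n, double beta,
                           const double *scratch, double *c, size_t ldc)
{
    for (size_t i = 0; i < n; ++i) {
        size_t j_begin = (uplo == TRI_LOWER) ? 0 : i;
        size_t j_end   = (uplo == TRI_LOWER) ? i + 1 : n;
        for (size_t j = j_begin; j < j_end; ++j) {
            if (beta == 0.0)
                c[i * ldc + j] = scratch[i * n + j];   /* C is not read */
            else
                c[i * ldc + j] = beta * c[i * ldc + j] + scratch[i * n + j];
        }
    }
}

/* C := alpha * A * A^T + beta * C, restricted to one triangle of C.
   scratch must hold at least n * n doubles. */
void syrk_small_sketch(tri_t uplo, size_t n, size_t k, double alpha,
                       const double *a, size_t lda, double beta,
                       double *c, size_t ldc, double *scratch)
{
    assert(a != NULL && c != NULL && scratch != NULL);
    full_product(n, k, alpha, a, lda, scratch);
    store_triangle(uplo, n, beta, scratch, c, ldc);
}
```

Keeping the product and the store as separate steps mirrors the description above: the GEMM-style computation can stay unchanged, and only the final store decides which triangle of C is touched.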
Change-Id: I9cfb1116c93d150aefac673fca033952ecac97cb
---
 config/zen/bli_cntx_init_zen.c | 1 -
 frame/3/syrk/bli_syrk_front.h  | 4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/config/zen/bli_cntx_init_zen.c b/config/zen/bli_cntx_init_zen.c
index b7ce799af..59385af14 100644
--- a/config/zen/bli_cntx_init_zen.c
+++ b/config/zen/bli_cntx_init_zen.c
@@ -157,7 +157,6 @@ void bli_cntx_init_zen( cntx_t* cntx )
     bli_blksz_init_easy( &blkszs[ BLIS_NC ], 4080, 4080, 4080, 4080 );
 
 #else
-
     bli_blksz_init_easy( &blkszs[ BLIS_MC ], 144, 240, 144, 72 );
     bli_blksz_init_easy( &blkszs[ BLIS_KC ], 256, 512, 256, 256 );
     bli_blksz_init_easy( &blkszs[ BLIS_NC ], 4080, 2040, 4080, 4080 );
diff --git a/frame/3/syrk/bli_syrk_front.h b/frame/3/syrk/bli_syrk_front.h
index 98b1e1251..0b65303cc 100644
--- a/frame/3/syrk/bli_syrk_front.h
+++ b/frame/3/syrk/bli_syrk_front.h
@@ -5,6 +5,7 @@
    libraries.
 
    Copyright (C) 2014, The University of Texas at Austin
+   Copyright (C) 2018, Advanced Micro Devices, Inc.
 
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are
@@ -42,7 +43,6 @@ void bli_syrk_front
        rntm_t* rntm,
        cntl_t* cntl
      );
-
 err_t bli_syrk_small
      (
        obj_t* alpha,
@@ -52,4 +52,4 @@ err_t bli_syrk_small
        obj_t* c,
        cntx_t* cntx,
        cntl_t* cntl
-     );
\ No newline at end of file
+     );