mirror of
https://github.com/amd/blis.git
synced 2026-05-24 10:24:34 +00:00
Fixed blastest failure for 'generic' subconfig.
Details:
- Fixed a subtle and complicated bug that only manifested via the BLAS test drivers in the generic subconfiguration, and possibly any other subconfiguration that did not register complex-domain gemm ukernels, or registered ONLY real-domain ukernels as row-preferential. This is a long story, but it boils down to an exception to the "transpose the operation to bring storage of C into agreement with ukernel pref" optimization in bli_hemm_front.c and bli_symm_front.c sabotaging the proper functioning of the 1m method, but only when the imaginary component of beta is zero. See the comments in issue #342 for more details.
- Thanks to Dave Love for identifying the commit in which this bug was introduced, and other feedback related to this bug.
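For readers unfamiliar with the optimization named above, here is a minimal, self-contained sketch of the idea in C. It is not BLIS code: every `toy_` name, the stride-based "dislikes" rule, and the reference product (alpha = 1, beta = 0, symmetric A stored in full) are assumptions made for illustration. It shows that when C is stored against the kernel's preference, a left-side symm can be recast as a right-side symm on transposed views of B and C, because C^T = B^T * A^T and A^T = A for symmetric A. The sketch applies the check unconditionally, with no 1x1 special case.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only; none of these are BLIS types or APIs. */
typedef enum { TOY_LEFT, TOY_RIGHT } toy_side_t;
typedef enum { TOY_PREFERS_ROWS, TOY_PREFERS_COLS } toy_pref_t;

typedef struct {
    double *buf;  /* element (i,j) lives at buf[i*rs + j*cs] */
    int m, n;     /* number of rows, number of columns */
    int rs, cs;   /* row stride, column stride */
} toy_mat_t;

static double *toy_at(const toy_mat_t *x, int i, int j)
{
    return &x->buf[i * x->rs + j * x->cs];
}

/* Transpose a view: swap dimensions and strides; no data is moved. */
static void toy_induce_trans(toy_mat_t *x)
{
    int t;
    t = x->m;  x->m  = x->n;  x->n  = t;
    t = x->rs; x->rs = x->cs; x->cs = t;
}

/* Assumed rule: C is "liked" only if its unit stride matches the preference. */
static bool toy_ukr_dislikes_storage_of(const toy_mat_t *c, toy_pref_t pref)
{
    bool row_stored = (c->cs == 1);
    bool col_stored = (c->rs == 1);
    if (row_stored && pref == TOY_PREFERS_ROWS) return false;
    if (col_stored && pref == TOY_PREFERS_COLS) return false;
    return true;
}

/* Reference product: C = A*B (left) or C = B*A (right), with A symmetric. */
static void toy_symm_ref(toy_side_t side, const toy_mat_t *a,
                         const toy_mat_t *b, toy_mat_t *c)
{
    int k_len = (side == TOY_LEFT) ? c->m : c->n;
    for (int i = 0; i < c->m; ++i)
        for (int j = 0; j < c->n; ++j) {
            double sum = 0.0;
            for (int k = 0; k < k_len; ++k)
                sum += (side == TOY_LEFT)
                     ? *toy_at(a, i, k) * *toy_at(b, k, j)
                     : *toy_at(b, i, k) * *toy_at(a, k, j);
            *toy_at(c, i, j) = sum;
        }
}

/* Front end: transpose the whole operation if C's storage is disliked.
 * Returns true when the transposition was applied. */
static bool toy_symm_front(toy_side_t side, const toy_mat_t *a,
                           const toy_mat_t *b, toy_mat_t *c, toy_pref_t pref)
{
    toy_mat_t b_local = *b, c_local = *c;  /* shallow copies; buffers shared */
    bool transposed = false;

    if (toy_ukr_dislikes_storage_of(&c_local, pref)) {
        side = (side == TOY_LEFT) ? TOY_RIGHT : TOY_LEFT;
        toy_induce_trans(&b_local);
        toy_induce_trans(&c_local);
        transposed = true;
    }
    toy_symm_ref(side, a, &b_local, &c_local);
    return transposed;
}
```

Calling `toy_symm_front` on a column-stored C with either preference should fill the caller's buffer with the same values; only the traversal order and the side differ.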
@@ -111,7 +111,8 @@ void bli_hemm_front
     // contiguous columns, or if C is stored by columns and the micro-kernel
     // prefers contiguous rows, transpose the entire operation to allow the
     // micro-kernel to access elements of C in its preferred manner.
-    if ( !bli_obj_is_1x1( &c_local ) )
+    //if ( !bli_obj_is_1x1( &c_local ) ) // NOTE: This conditional should NOT
+                                         // be enabled. See issue #342 comments.
     if ( bli_cntx_l3_vir_ukr_dislikes_storage_of( &c_local, BLIS_GEMM_UKR, cntx ) )
     {
         bli_toggle_side( &side );
@@ -111,7 +111,8 @@ void bli_symm_front
     // contiguous columns, or if C is stored by columns and the micro-kernel
     // prefers contiguous rows, transpose the entire operation to allow the
     // micro-kernel to access elements of C in its preferred manner.
-    if ( !bli_obj_is_1x1( &c_local ) )
+    //if ( !bli_obj_is_1x1( &c_local ) ) // NOTE: This conditional should NOT
+                                         // be enabled. See issue #342 comments.
     if ( bli_cntx_l3_vir_ukr_dislikes_storage_of( &c_local, BLIS_GEMM_UKR, cntx ) )
     {
         bli_toggle_side( &side );
@@ -129,7 +129,8 @@ void bli_trmm_front
     // micro-kernel to access elements of C in its preferred manner.
     // NOTE: We disable the optimization for 1x1 matrices since the concept
     // of row- vs. column storage breaks down.
-    if ( !bli_obj_is_1x1( &c_local ) )
+    //if ( !bli_obj_is_1x1( &c_local ) ) // NOTE: This conditional should NOT
+                                         // be enabled. See issue #342 comments.
     if ( bli_cntx_l3_vir_ukr_dislikes_storage_of( &c_local, BLIS_GEMM_UKR, cntx ) )
     {
         bli_toggle_side( &side );