Bugfix: Tuned zgemm threshold for zen4 (#129)

* Bugfix: Tuned zgemm threshold for zen4

Tune the threshold that determines whether the SUP or the native path
is used for a given input matrix size.

This tuning forces skinny matrices to take the SUP path, which performs
better for those shapes.

* Bugfix: Tuned zgemm threshold for zen4 and zen5

---------

Co-authored-by: harsdave <harsdave@amd.com>
Dave, Harsh
2025-08-13 19:02:39 +05:30
committed by GitHub
parent da875888d7
commit fa69528a3b
2 changed files with 2 additions and 2 deletions

@@ -87,7 +87,7 @@ bool bli_cntx_gemmsup_thresh_is_met_zen4( obj_t* a, obj_t* b, obj_t* c, cntx_t*
// The threshold for m is a single value, but for n, it is
// also based on the packing size of A, since the kernels are
// column preferential
if( ( m <= 1380 ) || ( n <= 1520 ) || ( k <= 128 ) ) return TRUE;
if( ( ( ( m <= 3400 ) || ( n <= 1800 ) ) && ( k <= 128 ) ) && ( m + n + k < 6400 ) ) return TRUE;
return FALSE;
}

@@ -87,7 +87,7 @@ bool bli_cntx_gemmsup_thresh_is_met_zen5( obj_t* a, obj_t* b, obj_t* c, cntx_t*
// The threshold for m is a single value, but for n, it is
// also based on the packing size of A, since the kernels are
// column preferential
if( ( m <= 1380 ) || ( n <= 1520 ) || ( k <= 128 ) ) return TRUE;
if( ( ( ( m <= 3400 ) || ( n <= 1800 ) ) && ( k <= 128 ) ) && ( m + n + k < 6400 ) ) return TRUE;
return FALSE;
}
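
To make the difference between the two conditions in the hunks above concrete, the sketch below restates each one as a standalone C predicate. It does not assert which condition is the active one; that is for the diff itself to say. The names `thresh_or_form` and `thresh_bounded_form` and the `dim_t` typedef are illustrative assumptions, not BLIS symbols; in BLIS, m, n, and k are derived from the `obj_t` operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the BLIS dimension type (assumption for this sketch). */
typedef int64_t dim_t;

/* First condition in the hunk: any single small dimension selects SUP. */
bool thresh_or_form( dim_t m, dim_t n, dim_t k )
{
    return ( m <= 1380 ) || ( n <= 1520 ) || ( k <= 128 );
}

/* Second condition in the hunk: SUP only when k is small, m or n is
   moderate, and the combined size m + n + k stays under 6400. */
bool thresh_bounded_form( dim_t m, dim_t n, dim_t k )
{
    return ( ( ( m <= 3400 ) || ( n <= 1800 ) ) && ( k <= 128 ) )
           && ( m + n + k < 6400 );
}
```

For a skinny input such as m = 100, n = 5000, k = 5000, `thresh_or_form` returns true because m alone is small enough, while `thresh_bounded_form` returns false because k exceeds 128. For m = n = 1000, k = 64, both predicates return true.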