Details:
- Changed the thresholds that select between the native and SUP paths.
- Improved ST performance for sizes n<800.
- Introduced PACKB in SUP to improve ST performance for 320<n<800.
- Tuned SUP with 16 threads (16T) for n<1600.
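
The size-based selection described above can be sketched roughly as follows. This is only an illustration: the function names, the enum, and the use of 320, 800 and 1600 as hard cutoffs are assumptions borrowed from the size ranges in this message, not the tuned thresholds in the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoffs for illustration only. The real thresholds are
   tuned per architecture and thread count and are not stated here. */
enum { PACKB_MIN_N = 320, SUP_MAX_N_ST = 800, SUP_MAX_N_16T = 1600 };

typedef enum { PATH_SUP, PATH_NATIVE } gemm_path_t;

/* Pick the SUP path for small problems and the native path otherwise.
   The cutoff depends on the thread count (assumed: ST vs 16T). */
static gemm_path_t choose_path(size_t n, int n_threads)
{
    assert(n_threads >= 1);
    size_t cutoff = (n_threads >= 16) ? SUP_MAX_N_16T : SUP_MAX_N_ST;
    return (n < cutoff) ? PATH_SUP : PATH_NATIVE;
}

/* Pack B inside SUP only for single-threaded runs with 320 < n < 800. */
static bool use_packb(size_t n, int n_threads)
{
    return n_threads == 1 && n > PACKB_MIN_N && n < SUP_MAX_N_ST;
}
```

Under these assumed cutoffs, a single-threaded call with n = 500 would take the SUP path with B packed, while n = 1000 would take the native path.
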
AMD-Internal: [CPUPL-1981]
Change-Id: Ie59afa4d31570eb0edccf760c088deaa2e10cdda
Details:
- Eliminated the IR loop in ref_var2m functions.
- Handled the rectangular and triangular portions of the C matrix
separately.
- Added a condition to detect and skip zero regions inside the IC loop.
- Modified the KC selection logic to choose the optimal KC in SUP.
- Updated thresholds to choose between SUP and native.
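
A minimal sketch of the rectangular/triangular split and the zero-region check, assuming a lower-triangular update C += A*B on row-major matrices. The function names, block sizes and loop structure are hypothetical and much simpler than the actual ref_var2m code; the sketch only shows how blocks can be classified.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Update rows [i0,i1) x cols [j0,j1) of C += A*B.
   If lower_only is set, only elements with j <= i are touched. */
static void update_block(size_t n, size_t k, const double *a,
                         const double *b, double *c,
                         size_t i0, size_t i1, size_t j0, size_t j1,
                         int lower_only)
{
    for (size_t i = i0; i < i1; ++i) {
        size_t jend = lower_only ? min_sz(j1, i + 1) : j1;
        for (size_t j = j0; j < jend; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] += acc;
        }
    }
}

/* Lower triangle of C (n x n) += A (n x k) * B (k x n), by blocks. */
static void gemmt_lower_blocked(size_t n, size_t k, const double *a,
                                const double *b, double *c,
                                size_t mc, size_t nc)
{
    assert(mc > 0 && nc > 0);
    for (size_t i0 = 0; i0 < n; i0 += mc) {      /* IC-style row loop */
        size_t i1 = min_sz(i0 + mc, n);
        for (size_t j0 = 0; j0 < n; j0 += nc) {
            size_t j1 = min_sz(j0 + nc, n);
            if (j0 >= i1)
                break;  /* zero region: this and later blocks lie above the diagonal */
            if (j1 <= i0 + 1)   /* rectangular: whole block is on or below the diagonal */
                update_block(n, k, a, b, c, i0, i1, j0, j1, 0);
            else                /* triangular: block crosses the diagonal */
                update_block(n, k, a, b, c, i0, i1, j0, j1, 1);
        }
    }
}
```

Rectangular blocks run without a per-element diagonal test, and blocks entirely above the diagonal are never visited, which is the intent of the two bullets above.
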
Change-Id: I21908eaa6bc3a8f37bdea29f7bfca7e6fcfee724
Details:
- Adding threshold function pointers to cntx makes it possible to choose
different threshold functions for different configurations.
- In a fat binary, where the configuration is decided at run time,
placing threshold functions under a macro enables them for every
config in a family. Storing function pointers in cntx avoids this:
the pointers can be queried from cntx at run time based on the
config chosen.
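
The idea can be sketched as follows. This is not the BLIS cntx API: the struct, the config ids, the lookup function and the threshold values are all hypothetical, and the sketch only shows a per-config function pointer being queried at run time instead of being fixed by a compile-time macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold function type: returns true if SUP should be used. */
typedef bool (*sup_thresh_fn)(size_t m, size_t n, size_t k);

/* Hypothetical stand-in for a context holding per-config settings. */
typedef struct {
    const char   *name;
    sup_thresh_fn use_sup;  /* chosen per config, queried at run time */
} cntx_sketch_t;

/* Two made-up configs in the same family with different cutoffs. */
static bool thresh_cfg_a(size_t m, size_t n, size_t k)
{
    return m < 400 && n < 400 && k < 400;
}

static bool thresh_cfg_b(size_t m, size_t n, size_t k)
{
    return m < 200 && n < 200 && k < 200;
}

typedef enum { CFG_A, CFG_B, CFG_COUNT } cfg_id_t;

static const cntx_sketch_t cntx_table[CFG_COUNT] = {
    [CFG_A] = { "cfg_a", thresh_cfg_a },
    [CFG_B] = { "cfg_b", thresh_cfg_b },
};

/* Look up the context for the config selected at run time. */
static const cntx_sketch_t *query_cntx(cfg_id_t id)
{
    assert((unsigned)id < CFG_COUNT);
    return &cntx_table[id];
}

/* The caller never names a specific threshold function. */
static bool should_use_sup(cfg_id_t id, size_t m, size_t n, size_t k)
{
    return query_cntx(id)->use_sup(m, n, k);
}
```

With this layout, a fat binary can hold both threshold functions while each config only ever calls the one registered in its own context.
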
Change-Id: Iaf7e69e45ae5bb60e4d0f75c7542a91e1609773f