Details:
-Kernel is called directly from API call to avoid framework
overhead in case of single and double precisions.
-Currently these changes are applicable only for zen2 configuration.
They will be enabled for zen family processors in future.
-These changes improve performance of BLAS and CBLAS interfaces of API.
They do not affect BLIS-specific APIs.
Change-Id: I1eb7ca470ced82c3cfa8b22f2b53000d42fef96c
Signed-off-by: Meghana Vankadari <Meghana.Vankadari@amd.com>
AMD-Internal: [CPUPL-847,CPUPL-816]