Several improvements to BLIS DTL functionality
- For APIs that report performance statistics, check for time == 0.0
before dividing by time when calculating GFLOPS, to avoid a
division by zero.
- Call AOCL_DTL_TRACE_EXIT in the parameter checking functions
inlined from ./frame/compat/check/bla_*_check.h
- Correct flop count for complex routines.
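
For illustration only, the zero-time guard and the complex flop count
could look roughly like the sketch below. The helper names gemm_flops
and safe_gflops are hypothetical and are not the identifiers used in
the DTL sources. GEMM is used only as an example routine, and the
sketch assumes the usual convention that a complex multiply-add costs
8 real flops against 2 for a real one.

```c
/* Hypothetical helper (not a BLIS DTL function): GEMM flop count.
 * Real routines:    2 * m * n * k
 * Complex routines: 8 * m * n * k, since a complex multiply is
 * 6 real flops and a complex add is 2 real flops. */
static double gemm_flops(double m, double n, double k, int is_complex)
{
    double flops = 2.0 * m * n * k;
    return is_complex ? 4.0 * flops : flops;
}

/* Hypothetical helper (not a BLIS DTL function): GFLOPS with a guard
 * against a zero (or negative) elapsed time. Returning 0.0 in that
 * case is an assumption made for this sketch. */
static double safe_gflops(double flops, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return flops / seconds / 1e9;
}
```

Reporting 0.0 when the timer reads zero keeps inf and NaN out of the
logged statistics. Other sentinel values would work equally well.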
AMD-Internal: [CPUPL-3736]
Change-Id: Icc515d88810dd79e66e22ea8c47d84649ca9f768