* Added bias_bnorm_clamp instances.
* fwd_bias_bnorm_clamp comp instances
* fwd_bias_bnorm_mem_inter and mem_intra instances
* fwd_bias_bnorm_merged_group_instances
* fwd_bias_bnorm_clamp_conv3d_bf16 and f16 instances
* Device level changes for fwd_bias_bnorm_clamp
* Added the test to the regression test list.
* Removed the part 2 and 2x instances
* Removed the irrelevant checks in wmma
* Refactored the instances to adapt to new device implementation
* Updated the reference and include files
* Enabled tests
* Added missing profiler
* Added back the missing instance entry that was deleted by mistake
* Reduce bias bnorm clamp instances to only a single generic one.
* Clean up CMakeLists file
* clang-format
* Change bias bnorm clamp tests to use monotone initialization values to prevent tiny off-integer GEMM results on RDNA3 from blowing up.
* Renamed some instance lists and add functions to be more standardized.
* Commented out non-default instances.
---------
Co-authored-by: kiefer <kiefer.van.teutem@streamhpc.com>