* add support for mixed-precision bf16 & int8 grouped GEMM
* fix gfx versions and add bf16 kbatch condition
* address reviewers' comments