Details: k0 is always positive in bli_dgemm_haswell_asm_6x8(), the operation involved with
k0 is typecasted to uint64_t to enable AOCC generate optimized code.
Thanks for Jini Susan (jinisusan.george@amd.com) from compiler team for suggesting
this change. Similar change was applied to sgemm, cgemm and zgemm kernels.
Change-Id: I423c949e0c1835652142a6931dadf4a7d190aeb9