Gemm reduce examples int4/int8/fp32/bf16 (#368)

* GEMM + Reduce max fp16 + fp32.
* GEMM + Max bf16 + int8.
* Refactor common definitions.
* Refactor common functions of the mean meansquare example.
* More examples for mean meansquare.
* Update int8 examples and skip them because of random errors.
* Int4 examples.
* Fix examples for max int4/8.
* Tensor conversion for int4 input data for mean meansquare example.
* Remove int4 mean_meansquare example.
* Fix int8 mean_meansquare example.
  - All ReductionAccData and R<N>DataType have to be F32. The INT32 data type is giving wrong results.
* Guard int4 with ifdef.
* Change int8 example to add_addsquare due to division rounding error.
* Clang format.
* Change the return type of common function.
* Get back int8 example with division.
* Remove int8 mean meansquare.
* Use proper cast for BF16 data type.
* Use ck::literals.
* Use proper data type for host tensors & reference.
  - Use ReduceAccDataType for reference gemm output data type.
  - Cast host reference output tensor to EDataType.
  - Fix ifdefs for int4.

Co-authored-by: Adam Osewski <aosewski@amd.com>

[ROCm/composable_kernel commit: d00e6115b9]
2026-05-16 10:59:55 +00:00 · 2022-08-30 18:38:26 +02:00
parent 6446894289
commit 0cc91c73d8
13 changed files with 2141 additions and 359 deletions
--- a/library/include/ck/library/utility/host_tensor.hpp
+++ b/library/include/ck/library/utility/host_tensor.hpp
@@ -259,7 +259,7 @@ struct Tensor
        Tensor<OutT> ret(mDesc);
        for(size_t i = 0; i < mData.size(); i++)
        {
-            ret.mData[i] = static_cast<OutT>(mData[i]);
+            ret.mData[i] = ck::type_convert<OutT>(mData[i]);
        }
        return ret;
    }
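
The hunk above swaps `static_cast` for `ck::type_convert` in the element-wise tensor copy, which lines up with the "Use proper cast for BF16 data type" item in the commit message. A minimal standalone sketch of why that can matter, assuming bf16 values are held as raw 16-bit integers: a plain `static_cast` reads the bit pattern as a number, while a conversion hook can decode it. The names `bf16_bits`, `float_to_bf16`, `bf16_to_float`, `convert`, and `demo` are hypothetical and are not the library's API; the truncating float-to-bf16 step is also an assumption made for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bf16 storage type: the raw 16 bits in an integer.
using bf16_bits = std::uint16_t;

// float -> bf16 by keeping the upper 16 bits (truncation, no rounding).
inline bf16_bits float_to_bf16(float x)
{
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof(u));
    return static_cast<bf16_bits>(u >> 16);
}

// bf16 -> float by placing the stored bits back into the upper half.
inline float bf16_to_float(bf16_bits b)
{
    const std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
    float x;
    std::memcpy(&x, &u, sizeof(x));
    return x;
}

// Hypothetical conversion hook: the generic case is just static_cast.
template <typename Out, typename In>
inline Out convert(In v)
{
    return static_cast<Out>(v);
}

// Specialization that decodes the bf16 bit pattern instead of its integer value.
// Note: in this sketch every uint16_t passed here is treated as bf16.
template <>
inline float convert<float, bf16_bits>(bf16_bits v)
{
    return bf16_to_float(v);
}

// Small self-check contrasting the two casts on the value 1.5.
inline void demo()
{
    const bf16_bits b = float_to_bf16(1.5f);        // stored bits are 0x3FC0
    assert(static_cast<float>(b) == 16320.0f);      // integer value of the bits
    assert((convert<float, bf16_bits>(b) == 1.5f)); // decoded value
}
```

In this sketch, types without a specialization still go through `static_cast`, so routing the tensor copy through a single conversion hook changes behavior only for the types that need decoding.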