62 Commits

Author SHA1 Message Date
Po Yen Chen
a11a56e54c Rename check_err() parameter 2024-04-09 14:58:25 +00:00
Po Yen Chen
93f608123d Rename check_err() parameter 2024-04-09 14:02:13 +00:00
Po Yen Chen
b71b9d2159 Add equal<float> & equal<double> 2024-04-09 13:49:31 +00:00
Po Yen Chen
54c28861fe Reuse the existing template less_equal<> in check_err() 2024-04-09 13:39:18 +00:00
Po Yen Chen
3f57b3068a Extend less_equal<> 2024-04-09 13:38:40 +00:00
Po Yen Chen
c6eac9746f Fix type errors in composes<> 2024-04-09 13:18:17 +00:00
Po Yen Chen
35e2d18e5e Merge branch 'ck_tile/elementwise' of github.com:ROCm/composable_kernel into ck_tile/elementwise 2024-04-09 12:50:40 +00:00
Po Yen Chen
a66409cfd9 Unify saturates<> implementation 2024-04-09 12:49:07 +00:00
rocking
83b8a99018 Merge branch 'ck_tile/refactor' into ck_tile/elementwise 2024-04-09 19:45:43 +08:00
carlushuang
89a75a97fa fix some bug in group-mode masking and codegen. update README 2024-04-09 19:01:25 +00:00
Po Yen Chen
ecc64bce12 Generalize the composes<> template 2024-04-09 10:14:56 +00:00
Po Yen Chen
6ed739f913 Fix wrong value produced by saturating 2024-04-09 09:27:58 +00:00
Po Yen Chen
5d0ebdbfe4 Re-use already-existing scales<> functor template 2024-04-09 08:06:38 +00:00
Po Yen Chen
ad45cf8613 Support heterogeneous argument for binary function types 2024-04-09 07:41:30 +00:00
Po Yen Chen
a9adfbe54a Small refinements in C++ source files 2024-04-09 06:45:03 +00:00
Po Yen Chen
20fcd69687 Remove not-in-use elementwise function kargs 2024-04-09 06:03:35 +00:00
rocking
5860f3134a Merge branch 'ck_tile/refactor' into ck_tile/elementwise 2024-04-09 02:37:42 +08:00
Po Yen Chen
87f3cd1ddd Use CK_TILE_FLOAT_TO_FP8_STANDARD as default fp8 rounding mode 2024-04-08 12:39:58 +00:00
Po Yen Chen
641ae96215 Check fp8 rounding error in check_err() 2024-04-08 12:39:27 +00:00
Po Yen Chen
92d45d1681 Fix wrong fp8 QK/KV block gemm setting 2024-04-08 12:39:17 +00:00
rocking
4e005f2457 Avoid warning 2024-04-08 10:11:51 +00:00
rocking
29a0670744 Remove remove_cvref_t 2024-04-08 10:03:48 +00:00
rocking
5c3fdeb0b8 Remove f8 pipeline, we should share the same pipeline even in f8 2024-04-08 09:56:23 +00:00
rocking
f7d81364f3 To prevent compiler issue, remove the elementwise function we have not used. 2024-04-08 09:44:21 +00:00
carlushuang
42ebffe822 1).support receipe in generate.py 2).use simplified mask type 3).change left/right to pass into karg 2024-04-07 23:30:34 +00:00
rocking
68153dea0b Let generate.py can generate different elementwise function 2024-04-04 03:59:38 +00:00
rocking
d6cb104d0f Add some elementwise op, prepare to quantization 2024-04-04 03:18:39 +00:00
rocking
d9323ea261 Fix bug of elementwise op, our elementwise op is not inout 2024-04-04 03:17:36 +00:00
rocking
bfcf550305 Adjust P elementwise function 2024-04-03 11:07:21 +00:00
rocking
cf57626c07 Merge branch 'ck_tile/refactor' into ck_tile/elementwise 2024-04-01 16:07:27 +08:00
carlushuang
42866940dc remove mistake 2024-03-31 00:01:30 +00:00
carlushuang
855a264b72 remove ck_tile example from default cmake target like all/install/check 2024-03-30 23:58:48 +00:00
rocking
286c74468d Add element function to fmha api 2024-03-29 18:05:36 -04:00
rocking
50c36f352a Add SAccElementFunction, PComputeElementFunction, OAccElementFunction in pipeline 2024-03-29 07:09:06 -04:00
carlushuang
13311f2e5a fix clang-format 2024-03-26 18:53:10 +00:00
carlushuang
c94b545747 update some readme 2024-03-26 13:35:53 +00:00
carlushuang
200d2b22d4 fix scratch in fp8 kernel 2024-03-25 19:45:38 +00:00
Po-Yen, Chen
1cacb713c5 Default use CK_TILE_FLOAT_TO_FP8_STOCHASTIC rounding mode 2024-03-23 22:51:18 -04:00
carlushuang
bb1f6e48eb fix fp8 duplicated move/shift/and/or problem 2024-03-19 23:29:57 +00:00
carlushuang
886d040a81 fix compile error, fp8 not ready now 2024-03-18 07:58:00 +00:00
carlushuang
f55c7629bc not using custom data type by default, now we can have ISA-level same code as opt_padding 2024-03-17 23:23:32 +00:00
carlushuang
ee397d0ab2 temp fix buffer_store spill 2024-03-15 22:56:41 +00:00
carlushuang
04762d212b make sure thread_buffer can be tuple/array 2024-03-13 22:03:42 +00:00
carlushuang
616932068d let more integral_constant->constant, and formating 2024-03-13 18:33:10 +00:00
Po-Yen, Chen
b1dbf64c91 Some minor changes 2024-03-13 03:55:07 -04:00
Po-Yen, Chen
8d1631adc9 Re-use function 2024-03-13 03:38:12 -04:00
Po-Yen, Chen
60221b89f8 Add constraint to array<> ctor 2024-03-13 03:32:05 -04:00
Po-Yen, Chen
5c433432fd Fix format 2024-03-13 03:21:30 -04:00
Po-Yen, Chen
958218e9d0 Rename enum: Rename 'cood_transform_enum' to 'coord_transform_enum' 2024-03-13 03:15:04 -04:00
carlushuang
d962a0044b fix compile issue in transpose 2024-03-13 15:02:45 +00:00