Bartłomiej Kocot
4d8fce33dd
Add SplitK support into Batched GEMM V3 (#1729)
* add bmm api
* add bf16 multi_d
* add ckProfiler for bf16
* add ckProfiler files
* add more instances; fixed 64bit index issue
* fixed naming
* enabled batched Ds
* use long_index for ds offsets
* clean
* add bmm fp8 ckProfiler
* Update example/24_batched_gemm/batched_gemm_xdl_bf16_v3.cpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update example/24_batched_gemm/batched_gemm_xdl_fp8_rowwise_v3.cpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update example/24_batched_gemm/run_batched_gemm_example_rowwise.inc
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update library/src/tensor_operation_instance/gpu/gemm_universal_batched/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn.hpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update library/src/tensor_operation_instance/gpu/gemm_universal_batched/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v1_default_instance.cpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update library/src/tensor_operation_instance/gpu/gemm_universal_batched/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_mem_v2_default_instance.cpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update profiler/src/profile_gemm_universal_batched.cpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* Update profiler/include/profiler/profile_gemm_universal_batched_impl.hpp
Co-authored-by: Bartłomiej Kocot <bartlomiejkocot98@gmail.com>
* clean
* Update include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp
* Update include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp
* Update library/src/tensor_operation_instance/gpu/gemm_universal_batched/device_batched_gemm_xdl_universal_bf16_bf16_bf16/device_batched_gemm_xdl_universal_bf16_bf16_bf16_mk_nk_mn_comp_default_instance.cpp
* Update include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp
* Update include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp
* Update include/ck/tensor_operation/gpu/device/impl/device_batched_gemm_multiple_d_xdl_cshuffle_v3.hpp
* refactor batch offset func
* add splitk support into bmm_v3
* clean
* clean
* format
* fixed
* fix
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
2024-12-13 21:08:35 +01:00