Commit Graph

  • f952d3571c Force both Gemm0 and Gemm1 to use mfma-16x16x32 on gfx950 Qianfeng Zhang 2025-11-28 13:45:20 +00:00
  • 74d3173d15 Merge commit 'f981554c39eafbf993e05c832cb86b3aaf474571' into develop assistant-librarian[bot] 2025-11-28 13:21:12 +00:00
  • 77407b3d26 [CK_TILE] Fix Quant GEMM build (#3320) Sami Remes 2025-11-28 12:33:53 +00:00
  • 1c73a3d480 [CK_TILE] Fix Quant GEMM build (#3320) Sami Remes 2025-11-28 12:33:53 +00:00
  • f981554c39 [CK_TILE] Fix Quant GEMM build (#3320) Sami Remes 2025-11-28 12:33:53 +00:00
  • 8d25f267ad Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Tianxing Wu 2025-11-28 12:04:31 +00:00
  • 4aa2407cf0 update yadaish 2025-11-28 11:31:25 +00:00
  • 0642090655 update yadaish 2025-11-27 14:47:51 +00:00
  • 412b42cd1b update yadaish 2025-11-25 16:21:37 +00:00
  • fe690d88fd remove unused varable max_accumulated_value from example mohsen saffari 2025-11-21 14:51:39 +01:00
  • 1039244779 clang format correction mohsen saffari 2025-11-21 14:39:32 +01:00
  • 79d7583b1d removed unused rtol_atol variable from example code mohsen saffari 2025-11-21 14:36:24 +01:00
  • 94ef537be1 correct clang-format mohsen saffari 2025-11-19 15:32:39 +01:00
  • d321b3486e Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic Mohsen Saffari 2025-11-19 14:02:24 +00:00
  • 125b7997d5 change endian seems working yadaish 2025-11-25 05:23:47 +00:00
  • da42b5af91 update yadaish 2025-11-24 10:59:09 +00:00
  • 7b2e154c3c update yadaish 2025-11-24 10:45:56 +00:00
  • 33f41e5ff7 update yadaish 2025-11-24 09:57:31 +00:00
  • 1d7e3a5d99 fix out of lds yadaish 2025-11-24 09:32:44 +00:00
  • 151acb3027 support a16_wint4 moe yadaish 2025-11-19 04:13:41 +00:00
  • b99c48da2e mixed-prec flatmm pipeline improve yadaish 2025-11-18 10:12:11 +00:00
  • cf39173c71 update dev/yadai yadaish 2025-11-28 11:31:25 +00:00
  • c4fbbaaa55 Add trait file for update_moving_average and save_mean_inv_std Mohsen Saffari 2025-11-28 10:33:58 +00:00
  • 9302ea4ace Merge branch 'develop' into tests_for_batched_grouped_gemm tests_for_batched_grouped_gemm Aleksander Dudek 2025-11-28 10:28:57 +00:00
  • c79879c669 splitk hack pass oscar 2025-11-28 17:41:29 +08:00
  • 296bf24afd Merge commit 'f875ab0bbc6ea68a689a688a58f9a53ad12fd536' into develop assistant-librarian[bot] 2025-11-28 09:13:31 +00:00
  • 990f13229f [CK_TILE] Add pooling to ckTileEngine part2 Aleksander Dudek 2025-11-28 08:56:17 +00:00
  • 4f5a48c910 Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic-add (#3236) msaffari-amd 2025-11-28 09:43:01 +01:00
  • 1055485a38 Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic-add (#3236) msaffari-amd 2025-11-28 09:43:01 +01:00
  • f875ab0bbc Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic-add (#3236) msaffari-amd 2025-11-28 09:43:01 +01:00
  • bb3d2f5be6 update zhimding/moe_flatmm_async zhimding 2025-11-28 08:07:30 +00:00
  • 6032baee56 Merge commit '30727c48fcdf2178f013cbb843db563abd77d09c' into develop assistant-librarian[bot] 2025-11-27 23:12:24 +00:00
  • fa1c7bc6ba Tile engine for streamk (#3157) Cong Ma 2025-11-27 15:49:57 -07:00
  • f622d546f3 Tile engine for streamk (#3157) Cong Ma 2025-11-27 15:49:57 -07:00
  • 30727c48fc Tile engine for streamk (#3157) Cong Ma 2025-11-27 15:49:57 -07:00
  • a73761408e Fix missing copyright notices. John Afaganis 2025-11-27 13:56:13 -07:00
  • d0b319035a Merge commit '24d88d24729cc097d6d0c87a839827f40e35d86a' into develop assistant-librarian[bot] 2025-11-27 17:12:03 +00:00
  • a3d6a1cb26 [CK_TILE] Move DataTypeTraits into a Common File (#3146) arai713 2025-11-27 09:09:54 -08:00
  • 6d28d12b62 [CK_TILE] Move DataTypeTraits into a Common File (#3146) arai713 2025-11-27 09:09:54 -08:00
  • 24d88d2472 [CK_TILE] Move DataTypeTraits into a Common File (#3146) arai713 2025-11-27 09:09:54 -08:00
  • a0e4315d4e Use 16x16x32 for Gemm1 on MI350 and adjust the NumPrefetchK for with_softmax trload pipeline Qianfeng Zhang 2025-11-27 15:30:53 +00:00
  • 60ca9484b4 refined benchmarking Tianxing Wu 2025-11-27 15:07:03 +00:00
  • b732595d9f add kSaveMeanInvStd, kUpdateMovingAverage in Traits Mohsen Saffari 2025-11-27 15:06:13 +00:00
  • cbbfa46ce7 update yadaish 2025-11-27 14:47:51 +00:00
  • 5c8e8684ec add gamma, bias to the simple kernel Mohsen Saffari 2025-11-27 13:32:18 +00:00
  • 3131ebf1df simplified kernel pid logic Tianxing Wu 2025-11-27 13:28:35 +00:00
  • 99f5c2fcf7 [CK_TILE] Add pooling to ckTileEngine part1 Aleksander Dudek 2025-11-27 11:31:53 +00:00
  • b8b8b6fe11 Merge branch 'develop' into moe_xcd_remap Tianxing Wu 2025-11-27 12:33:00 +02:00
  • eeb419845d fmha v3 flops calculation Tianxing Wu 2025-11-27 10:32:28 +00:00
  • a36fd78205 Enhance fp16/i8, remove unnecessary instances from 8bit types Wojciech Laskowski 2025-11-27 10:08:06 +00:00
  • c641d0d42c non zero calculation fix Tianxing Wu 2025-11-27 09:24:52 +00:00
  • 6a2ac8f758 causal mask fix Tianxing Wu 2025-11-27 09:16:30 +00:00
  • 69c97c06d7 Add hstu_attention_api.hpp to explicitly mark the API interfaces and update REAMD.md Qianfeng Zhang 2025-11-27 08:04:47 +00:00
  • c27dc5875d Merge commit '678298d4c7141d41a552e7d8fea396ee88a4652f' into develop assistant-librarian[bot] 2025-11-27 08:15:41 +00:00
  • 6c993365ac Add support for gfx1153 (#3306) Matthias Gehre 2025-11-27 08:48:00 +01:00
  • f3e08fa9d6 Add support for gfx1153 (#3306) Matthias Gehre 2025-11-27 08:48:00 +01:00
  • 678298d4c7 Add support for gfx1153 (#3306) Matthias Gehre 2025-11-27 08:48:00 +01:00
  • b80314a150 turbo config rocm7.1_gg_performance kyle-256 2025-11-27 03:13:38 +00:00
  • a3422f31e3 Merge commit 'a38aeceb2164f9d1807bda1a19d59636bafd4f31' into develop assistant-librarian[bot] 2025-11-27 02:44:03 +00:00
  • 6f751b7a9b Fix and improve the gemm quant pipeline infrastructure (#3245) Thomas Ning 2025-11-26 18:04:27 -08:00
  • 997a24f3d0 Fix and improve the gemm quant pipeline infrastructure (#3245) Thomas Ning 2025-11-26 18:04:27 -08:00
  • a38aeceb21 Fix and improve the gemm quant pipeline infrastructure (#3245) Thomas Ning 2025-11-26 18:04:27 -08:00
  • e7c7922385 Merge commit '79aae7c7f71404bdb80d6db52bc6401e0e221d42' into develop assistant-librarian[bot] 2025-11-27 00:36:02 +00:00
  • a7a9ccdeca [CK Tile] enable building examples by default (#3259) Max Podkorytov 2025-11-26 16:24:44 -08:00
  • 0ce4a61da5 [CK Tile] enable building examples by default (#3259) Max Podkorytov 2025-11-26 16:24:44 -08:00
  • 79aae7c7f7 [CK Tile] enable building examples by default (#3259) Max Podkorytov 2025-11-26 16:24:44 -08:00
  • d790a9f9de Automated Perfetto UI Notifications (#3255) andrew clark 2025-11-26 16:27:27 -07:00
  • 544525b61d Automated Perfetto UI Notifications (#3255) andrew clark 2025-11-26 16:27:27 -07:00
  • 40d7217ac7 Automated Perfetto UI Notifications (#3255) andrew clark 2025-11-26 16:27:27 -07:00
  • e8f2a1bfb9 Update rocm-docs-core to 1.30.0 ROCm Docs Automation 2025-11-26 17:10:46 -05:00
  • 3abaa3a1d2 Update rocm-docs-core to 1.30.0 docs/7.1.0 ROCm Docs Automation 2025-11-26 16:38:40 -05:00
  • 01136111f7 Merge pull request #3308 from spolifroni-amd/spolifroni-amd/ck-cherrypick-docs-711 spolifroni-amd 2025-11-26 13:31:35 -05:00
  • 5e1667f082 removed an extra newline that caused an issue spolifroni-amd 2025-11-26 13:28:41 -05:00
  • d07400e78b Improving the contribution page (#2804) spolifroni-amd 2025-09-09 15:24:44 -04:00
  • 2b28edd417 first commit of the glossary (#2702) spolifroni-amd 2025-09-08 13:55:32 -04:00
  • 2044d0dd35 Merge commit 'de6466481f9472350a5f4afce27c86ecdbb5b42f' into develop assistant-librarian[bot] 2025-11-26 18:14:59 +00:00
  • 05c9e2ac07 Merge branch 'develop' into enable_persistent_async Manish Kumar 2025-11-26 23:42:23 +05:30
  • 42ed693761 Resolve PR comments Manish Kumar 2025-11-26 18:08:28 +00:00
  • 216c23b945 chore(copyright): update copyright header for include directory (#3293) Aviral Goel 2025-11-26 22:00:05 +04:00
  • ee7a68b10f chore(copyright): update copyright header for include directory (#3293) Aviral Goel 2025-11-26 22:00:05 +04:00
  • de6466481f chore(copyright): update copyright header for include directory (#3293) Aviral Goel 2025-11-26 22:00:05 +04:00
  • 90e0eb4dfc Fix template parameter macros (#3305) John Shumway 2025-11-26 09:48:17 -08:00
  • d449d96c98 Fix template parameter macros (#3305) John Shumway 2025-11-26 09:48:17 -08:00
  • 10a782d846 Fix template parameter macros (#3305) John Shumway 2025-11-26 09:48:17 -08:00
  • 94e0c2465f Addressing Copilot review comments ThruptiRajLakshmanaGowda 2025-11-26 15:01:07 +00:00
  • 283383c61c Merge commit '35a4b26af0088ca0d634b57055a4143fdb9f2e2d' into develop assistant-librarian[bot] 2025-11-26 07:13:26 +00:00
  • 612f91226f fix: add dynamic selection of pipelines for aquant mode (#3282) Aviral Goel 2025-11-26 10:58:09 +04:00
  • cb0aaf8e90 fix: add dynamic selection of pipelines for aquant mode (#3282) Aviral Goel 2025-11-26 10:58:09 +04:00
  • 35a4b26af0 fix: add dynamic selection of pipelines for aquant mode (#3282) Aviral Goel 2025-11-26 10:58:09 +04:00
  • cba72006f8 Merge branch 'develop' into tileengine-restructure Thrupti Raj Lakshmana Gowda 2025-11-26 00:47:42 -06:00
  • 7140a27bbe Resolving merge Conflicts ThruptiRajLakshmanaGowda 2025-11-26 06:46:27 +00:00
  • 50262a6e37 Resolving merge conflicts ThruptiRajLakshmanaGowda 2025-11-26 06:33:50 +00:00
  • 6b20e4cd6a Resolving merge conflicts ThruptiRajLakshmanaGowda 2025-11-26 06:24:11 +00:00
  • cb1bea4929 splitk kick-off. Compilation fail oscar 2025-11-26 09:56:08 +08:00
  • 2b060900e7 Resolving merge conflicts ThruptiRajLakshmanaGowda 2025-11-26 05:29:25 +00:00
  • a86762f0f9 Merge commit '8fa90025d0da22683dabe721d77a75a536388683' into develop assistant-librarian[bot] 2025-11-26 03:34:44 +00:00
  • 16dd90a523 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272) Yi DING 2025-11-26 10:57:15 +08:00
  • 631655adb1 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272) Yi DING 2025-11-26 10:57:15 +08:00
  • 8fa90025d0 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272) Yi DING 2025-11-26 10:57:15 +08:00
  • 9eb4b35ef6 Merge commit 'c7dce2ac29136939b6fe6aabadd026e53dcf35c9' into develop assistant-librarian[bot] 2025-11-26 02:44:11 +00:00