Commit Graph

  • 5d1e915a8f Update cmake/SetupDocs.cmake pmaybank 2025-10-27 11:11:13 +00:00
  • a0847290d8 make a start on RDNA / Navi specific doc Philip Maybank 2025-10-20 12:25:54 +01:00
  • c121d5a4c4 Merge branch 'develop' into philipm/documentation-cleanup-5 pmaybank 2025-10-27 10:51:53 +00:00
  • a464269bb6 Fix in the comments Qianfeng Zhang 2025-10-27 10:36:15 +00:00
  • 4eeb5cc917 Update to gemm_0's CBlockDistribution encoding so that it is compatible with gemm_1's ABlockDistribution encoding Qianfeng Zhang 2025-10-27 10:34:45 +00:00
  • 4fdde500eb change tilem moe_block_m_32 felix 2025-10-26 08:24:24 +00:00
  • a35bc01f27 add instance for tokens<=8 felix 2025-10-26 07:59:49 +00:00
  • 2484317b9d use 512 for k, slitghtly better felix 2025-10-26 01:34:26 +00:00
  • c29228adcf Merge commit '6d709dac41409a339b82a83ea59e03fbb37c7005' into develop assistant-librarian[bot] 2025-10-25 15:11:17 +00:00
  • dff4005b7b [CK Builder] Add missing tf32 type to reflection. (#3090) John Shumway 2025-10-25 07:28:12 -07:00
  • a3261e87a3 [CK Builder] Add missing tf32 type to reflection. (#3090) John Shumway 2025-10-25 07:28:12 -07:00
  • 6d709dac41 [CK Builder] Add missing tf32 type to reflection. (#3090) John Shumway 2025-10-25 07:28:12 -07:00
  • dc6d0327f9 [CK_Builder] Add name member to unary elementwise ops & update builder traits. (#3093) Adam Osewski 2025-10-25 16:27:03 +02:00
  • 75a0f41bb0 [CK_Builder] Add name member to unary elementwise ops & update builder traits. (#3093) Adam Osewski 2025-10-25 16:27:03 +02:00
  • f53d857b25 [CK_Builder] Add name member to unary elementwise ops & update builder traits. (#3093) Adam Osewski 2025-10-25 16:27:03 +02:00
  • 8003e1b024 [CK_BUILDER] Add inline string diff for tests (#3067) kabrahamAMD 2025-10-25 16:22:41 +02:00
  • 93a92cf2da [CK_BUILDER] Add inline string diff for tests (#3067) kabrahamAMD 2025-10-25 16:22:41 +02:00
  • e576992dca [CK_BUILDER] Add inline string diff for tests (#3067) kabrahamAMD 2025-10-25 16:22:41 +02:00
  • f48439f4c7 moe gemm2ok felix 2025-10-25 13:58:24 +00:00
  • 37fdb18e06 fix build felix 2025-10-25 13:43:51 +00:00
  • 1fe3c20ef2 [CK_TILE] fmha: Unify sequence length and padding handling zain/TE-native-bshd-thd Jeff Huang 2025-10-24 15:46:06 +08:00
  • eeffd2717a Adapt fmha_bwd_runner.cpp to new q, kv sequence padding Add backward q/kv sequence padding unit tests. Jeff Huang 2025-10-18 22:51:04 +08:00
  • 43d1245490 fix clang format illsilin_amdeng 2025-10-17 10:06:08 -07:00
  • 4e06eaa417 [CK_TILE] fmha: Add query padding support to backward pass Jeff Huang 2025-10-08 15:35:02 +08:00
  • e4c35a1432 moe_block_m_128 moe_block_m_128 xudoyuan 2025-10-25 15:22:16 +08:00
  • 4494721174 Merge commit '86d542f663201d7923c56cd8e31d46e01c4dcfcf' into develop assistant-librarian[bot] 2025-10-24 20:13:08 +00:00
  • edce1db08f Merge branch 'develop' into jzhou/pre-load-ds jzhou/pre-load-ds Thomas Ning 2025-10-24 12:17:26 -07:00
  • 3ecd2a8689 [CK-Tile][Async gemm] add missing sync and f8 inputs test cases (#3000) Max Podkorytov 2025-10-24 12:16:01 -07:00
  • 26c4304c84 [CK-Tile][Async gemm] add missing sync and f8 inputs test cases (#3000) Max Podkorytov 2025-10-24 12:16:01 -07:00
  • 86d542f663 [CK-Tile][Async gemm] add missing sync and f8 inputs test cases (#3000) Max Podkorytov 2025-10-24 12:16:01 -07:00
  • e7707c32d1 Merge commit '05843995715ee1e83e95906654a8210e1450b83d' into develop assistant-librarian[bot] 2025-10-24 18:15:08 +00:00
  • 2498b499a1 [CK_TILE] Adding support for TiledPermuteN on preshuffle Block Scale Gemm (#3019) Khushbu Agarwal 2025-10-24 11:06:51 -07:00
  • eef9513fd3 [CK_TILE] Adding support for TiledPermuteN on preshuffle Block Scale Gemm (#3019) Khushbu Agarwal 2025-10-24 11:06:51 -07:00
  • 0584399571 [CK_TILE] Adding support for TiledPermuteN on preshuffle Block Scale Gemm (#3019) Khushbu Agarwal 2025-10-24 11:06:51 -07:00
  • ca3d00dcbe Update generate.py xyt/ln_patch Yutao Xu 2025-10-25 00:52:25 +08:00
  • 69bbe0480b config block_m = 32 xudoyuan 2025-10-24 16:13:55 +00:00
  • 0d4c6c2c13 Merge commit 'f39626fcf72d0188946040fe6441437415707343' into develop assistant-librarian[bot] 2025-10-24 16:13:23 +00:00
  • 99ad6f60e4 [CK][host] limit the rotating count to prevent oom (#3089) Max Podkorytov 2025-10-24 08:55:54 -07:00
  • a1681b077e [CK][host] limit the rotating count to prevent oom (#3089) Max Podkorytov 2025-10-24 08:55:54 -07:00
  • f39626fcf7 [CK][host] limit the rotating count to prevent oom (#3089) Max Podkorytov 2025-10-24 08:55:54 -07:00
  • c67f3501b0 limit the rotating count to prevent oom (#3087) Max Podkorytov 2025-10-24 08:55:34 -07:00
  • 77fc1e4c3f limit the rotating count to prevent oom (#3087) Max Podkorytov 2025-10-24 08:55:34 -07:00
  • fdcc1f75c3 limit the rotating count to prevent oom (#3087) Max Podkorytov 2025-10-24 08:55:34 -07:00
  • 2550111808 Merge commit '775b96ea6a8bb0d82d635dc1a396c8d98091c832' into develop assistant-librarian[bot] 2025-10-24 15:12:08 +00:00
  • 07d67497ff Fixing Run CI Check for Changed Files (#3072) andrew clark 2025-10-24 08:52:43 -06:00
  • c47b82b103 Fixing Run CI Check for Changed Files (#3072) andrew clark 2025-10-24 08:52:43 -06:00
  • 775b96ea6a Fixing Run CI Check for Changed Files (#3072) andrew clark 2025-10-24 08:52:43 -06:00
  • c4448c9d7c [CK_TILE] add tensorwise quant in grouped gemm (#3007) kyle-256 2025-10-24 22:41:54 +08:00
  • b49f5d9de5 [CK_TILE] add tensorwise quant in grouped gemm (#3007) kyle-256 2025-10-24 22:41:54 +08:00
  • 3c12a02827 [CK_TILE] add tensorwise quant in grouped gemm (#3007) kyle-256 2025-10-24 22:41:54 +08:00
  • 22c5c20977 Debugging window size Tianxing Wu 2025-10-24 09:45:32 +00:00
  • 52434da15a Merge commit '6bbc05e1bd1f1dd1bcc61a1e815f470cd4c9ac7f' into develop assistant-librarian[bot] 2025-10-24 09:13:29 +00:00
  • 115ba5ece4 mxpf4 moe block_m 32 xudoyuan 2025-10-24 16:54:26 +08:00
  • fe303e69b6 Update GetTypeString for grouped bwd wei explicit gemm barkocot/explicit-string-out Bartlomiej Kocot 2025-07-29 14:47:36 +00:00
  • 6a7861bbec conv:tf32:add missed instances (#3081) yinglu 2025-10-24 16:28:36 +08:00
  • 480f05ffd9 conv:tf32:add missed instances (#3081) yinglu 2025-10-24 16:28:36 +08:00
  • 6bbc05e1bd conv:tf32:add missed instances (#3081) yinglu 2025-10-24 16:28:36 +08:00
  • 98a241a2eb Using separate tile settings for no-softmax and with-softmax hstu attention situations Qianfeng Zhang 2025-10-23 14:45:28 +00:00
  • 1ee7564ae5 implement gemm universal with a visitor Max Podkorytov 2025-10-23 14:57:21 -05:00
  • 02a1856f3a add boilerplate Max Podkorytov 2025-10-21 17:41:23 -05:00
  • 9fde1d98ad Merge commit 'd0364641ed7f7520ed0163e4768d900b8c07af7a' into develop assistant-librarian[bot] 2025-10-23 20:13:04 +00:00
  • e316ba18ed [CK_BUILDER] old ck build fixes (#3075) Robin Voetter 2025-10-23 22:01:19 +02:00
  • 88771b5f47 [CK_BUILDER] old ck build fixes (#3075) Robin Voetter 2025-10-23 22:01:19 +02:00
  • d0364641ed [CK_BUILDER] old ck build fixes (#3075) Robin Voetter 2025-10-23 22:01:19 +02:00
  • 96942c824f Excluding Tile engine from build (#3085) Thrupti Raj Lakshmana Gowda 2025-10-23 14:57:18 -05:00
  • 9a2f0f82b4 Excluding Tile engine from build (#3085) Thrupti Raj Lakshmana Gowda 2025-10-23 14:57:18 -05:00
  • 0fd7d1a607 Excluding Tile engine from build (#3085) Thrupti Raj Lakshmana Gowda 2025-10-23 14:57:18 -05:00
  • 0e6a5289fa adding commit hash (#3084) Geo Min 2025-10-23 12:32:26 -07:00
  • 2dc3dad0a0 adding commit hash (#3084) Geo Min 2025-10-23 12:32:26 -07:00
  • 2546fc241e adding commit hash (#3084) Geo Min 2025-10-23 12:32:26 -07:00
  • 8505bc05c9 Merge commit 'fe4eaeb2eb28088e07d7c7e5f8bd7499831a427c' into develop assistant-librarian[bot] 2025-10-23 19:11:30 +00:00
  • 5338925d70 Use filename but not path to filter compilation (#3083) Yi DING 2025-10-24 03:01:26 +08:00
  • 048edb2776 Use filename but not path to filter compilation (#3083) Yi DING 2025-10-24 03:01:26 +08:00
  • fe4eaeb2eb Use filename but not path to filter compilation (#3083) Yi DING 2025-10-24 03:01:26 +08:00
  • 0bd24dbfbf Merge commit 'bedade257241fef37a28c6e540e73f1c056d27b9' into develop assistant-librarian[bot] 2025-10-23 18:15:09 +00:00
  • 43fd710da0 Merge branch 'wip_355' into wip_355_xcd_remap wip_355_xcd_remap Illia Silin 2025-10-23 11:10:17 -07:00
  • d6933e661d [CK_TILE] Add fp4 warp gemm 16x16x128 (#2738) Gino Lu 2025-10-24 01:55:51 +08:00
  • 7e4c021e26 [CK_TILE] Add fp4 warp gemm 16x16x128 (#2738) Gino Lu 2025-10-24 01:55:51 +08:00
  • bedade2572 [CK_TILE] Add fp4 warp gemm 16x16x128 (#2738) Gino Lu 2025-10-24 01:55:51 +08:00
  • 1df6f6af8e Rearrange pointers to fix the reinterpret_cast issue (#3077) Rostyslav Geyyer 2025-10-23 12:54:13 -05:00
  • e6dc79dcc6 Rearrange pointers to fix the reinterpret_cast issue (#3077) Rostyslav Geyyer 2025-10-23 12:54:13 -05:00
  • 6df69abeef Rearrange pointers to fix the reinterpret_cast issue (#3077) Rostyslav Geyyer 2025-10-23 12:54:13 -05:00
  • 6ad906b040 [CK_TILE] Fix in set_slice_tile (#2232) Qianfeng 2025-10-24 01:34:02 +08:00
  • cf31de9211 [CK_TILE] Fix in set_slice_tile (#2232) Qianfeng 2025-10-24 01:34:02 +08:00
  • fbd101b1ac [CK_TILE] Fix in set_slice_tile (#2232) Qianfeng 2025-10-24 01:34:02 +08:00
  • 89cfdb35e0 Fixed block Q with M Tianxing Wu 2025-10-23 12:02:18 +00:00
  • d18f8e46bf Fixed block Q with M Tianxing Wu 2025-10-23 12:02:10 +00:00
  • ebf1c4c305 const blockq Tianxing Wu 2025-10-23 11:57:36 +00:00
  • 3bb29bfd6c Fixed pipeline args Tianxing Wu 2025-10-23 11:49:35 +00:00
  • 6ea56bec34 removed redundent code Tianxing Wu 2025-10-23 11:44:14 +00:00
  • 3fe5d793b6 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Tianxing Wu 2025-10-23 11:42:18 +00:00
  • e03ed35944 fix the vector max Tianxing Wu 2025-10-23 11:42:15 +00:00
  • 5bf72d2bcb fixing bugs Juuso Korhonen 2025-10-23 11:40:48 +00:00
  • 3bcef59536 block table stride fix Tianxing Wu 2025-10-23 11:25:07 +00:00
  • 0d2a9badba fixed example Tianxing Wu 2025-10-23 11:17:46 +00:00
  • 7c4012266a Update to benchmark scripts to consider for using softmax Qianfeng Zhang 2025-10-23 10:02:22 +00:00
  • 3c0e6d37bf fixing bugs Juuso Korhonen 2025-10-23 09:47:30 +00:00
  • e144872308 change to BLOCK_M in shape definitions Juuso Korhonen 2025-10-23 08:11:55 +00:00
  • 6c1c433260 Merge commit 'b9789a0742e4623a109472fad567ccea14c7ed89' into develop assistant-librarian[bot] 2025-10-23 07:13:33 +00:00
  • c37371e3ef [CK][Examples] Fixing stride issues in ck examples by workaround - Bypassing hostTensor validation. Michal Kulikowski 2025-10-16 13:01:24 +02:00