Commit Graph

  • 829eabed3a Fix fwd factories after refactoring. Ville Pietilä 2026-01-05 04:42:31 -05:00
  • 1dcea1825f Fix DeviceGroupedConvBwdWeightMultipleD_Wmma_CShuffleV3 factory and compute types for input and output tensor in bwd weight convs. Ville Pietilä 2026-01-05 03:09:22 -05:00
  • c5abac7854 Merge commit 'e339101e9c9961fe1bc8305d5c316b39d1980d3e' into develop assistant-librarian[bot] 2026-01-04 12:20:15 +00:00
  • ece5bd6435 [CK-Tile] move out memory operation from cshuffle epilogue class (#3359) Max Podkorytov 2026-01-04 03:28:14 -08:00
  • 6cf89bbca9 [CK-Tile] move out memory operation from cshuffle epilogue class (#3359) Max Podkorytov 2026-01-04 03:28:14 -08:00
  • e339101e9c [CK-Tile] move out memory operation from cshuffle epilogue class (#3359) Max Podkorytov 2026-01-04 03:28:14 -08:00
  • a3b06e7190 initial test setup sk_tests_backup Astha Rai 2026-01-04 10:56:45 +00:00
  • 814b609476 Update build analyzer for better usability John Shumway 2025-12-18 19:48:34 -05:00
  • 94daf3aa65 Improve DataFrame schema John Shumway 2025-12-15 08:24:05 -05:00
  • 0caf06e6f1 Add build trace analysis tools for -ftime-trace data John Shumway 2025-12-14 12:56:07 -05:00
  • d8734393c7 Merge commit 'ec23be0b9d45ff9ca4135090bcd0269184c953a7' into develop assistant-librarian[bot] 2026-01-03 06:16:07 +00:00
  • 3830186287 Update unsigned long literals and format specifiers to work correctly in Windows (#3483) John Afaganis 2026-01-02 22:16:41 -07:00
  • 077d75cea0 Update unsigned long literals and format specifiers to work correctly in Windows (#3483) John Afaganis 2026-01-02 22:16:41 -07:00
  • ec23be0b9d Update unsigned long literals and format specifiers to work correctly in Windows (#3483) John Afaganis 2026-01-02 22:16:41 -07:00
  • 0b05cd0351 Merge commit '4670df5ca606e6e3ee07a085ea61016489bf91ad' into develop assistant-librarian[bot] 2026-01-03 01:41:33 +00:00
  • ef899ca9ed [CK_BUILDER] Remove cmath include (#3508) John Shumway 2026-01-02 16:58:35 -08:00
  • 9e9cadefb5 [CK_BUILDER] Remove cmath include (#3508) John Shumway 2026-01-02 16:58:35 -08:00
  • 4670df5ca6 [CK_BUILDER] Remove cmath include (#3508) John Shumway 2026-01-02 16:58:35 -08:00
  • e64da4f3d6 Merge commit '355ce9230d9c4f2e74776e879f2bee71a26bae4a' into develop assistant-librarian[bot] 2026-01-02 23:12:46 +00:00
  • d1d1a55542 Remove non-standard M_PI (#3507) John Shumway 2026-01-02 14:21:46 -08:00
  • 853f3c6776 Remove non-standard M_PI (#3507) John Shumway 2026-01-02 14:21:46 -08:00
  • 355ce9230d Remove non-standard M_PI (#3507) John Shumway 2026-01-02 14:21:46 -08:00
  • 1b3eb980bf Merge commit '1da340031c98bfde0f142bf34493d087490ec70d' into develop assistant-librarian[bot] 2026-01-02 21:11:42 +00:00
  • 7e3274671b Enable math defines for MSVC. (#3503) John Shumway 2026-01-02 11:36:42 -08:00
  • 86b1f5749b Enable math defines for MSVC. (#3503) John Shumway 2026-01-02 11:36:42 -08:00
  • 1da340031c Enable math defines for MSVC. (#3503) John Shumway 2026-01-02 11:36:42 -08:00
  • 40e1482964 Update TheRock CI SHA 20260102 (#3506) Joseph Macaranas 2026-01-02 14:23:43 -05:00
  • 506a19a7e7 Update TheRock CI SHA 20260102 (#3506) Joseph Macaranas 2026-01-02 14:23:43 -05:00
  • cc1392a405 Update TheRock CI SHA 20260102 (#3506) Joseph Macaranas 2026-01-02 14:23:43 -05:00
  • 4eea42c5b7 Add factory for DeviceGroupedConvBwdWeightMultipleD_Wmma_CShuffleV3 Ville Pietilä 2026-01-02 09:52:52 -05:00
  • c3a9044bad Dispatching for DeviceGroupedConvBwdWeightMultipleD_Wmma_CShuffle. Ville Pietilä 2026-01-02 09:25:38 -05:00
  • 1759db7250 Factory and tests for DeviceGroupedConvBwdWeight_Wmma_CShuffle. Ville Pietilä 2026-01-02 08:52:38 -05:00
  • aa10d659b7 Added factory and tests for DeviceGroupedConvBwdWeightTwoStage_Wmma_CShuffleV3. Ville Pietilä 2026-01-02 07:51:40 -05:00
  • 74297016ea Update example/ck_tile/42_unified_attention/README.md Tianxing Wu 2026-01-02 14:29:33 +02:00
  • a1973b472f Update example/ck_tile/42_unified_attention/README.md Tianxing Wu 2026-01-02 14:29:05 +02:00
  • efb7ada2a6 Update example/ck_tile/42_unified_attention/README.md Tianxing Wu 2026-01-02 14:28:16 +02:00
  • 788890ac83 Update example/ck_tile/42_unified_attention/README.md Tianxing Wu 2026-01-02 14:23:15 +02:00
  • 6a62216c24 Update example/ck_tile/42_unified_attention/README.md Tianxing Wu 2026-01-02 14:20:56 +02:00
  • 6635c875dc Merge branch 'develop' into tianxing/unified-attention Tianxing Wu 2026-01-02 14:12:52 +02:00
  • 89934275f4 Fix WMMA bwd weight tests. Ville Pietilä 2026-01-02 07:07:08 -05:00
  • 2e43e16e47 Update Readme. Ville Pietilä 2026-01-02 07:06:51 -05:00
  • bc3cba873b Add factory and tests for DeviceGroupedConvBwdWeight_Wmma_CShuffleV3. Ville Pietilä 2026-01-02 06:08:29 -05:00
  • d045923a0d Refactor large tensor support and WMMA configuration. Ville Pietilä 2026-01-02 05:12:19 -05:00
  • 09e188f2a8 Treat ref algorithm the same way as real algorithms in the dispatcher. Ville Pietilä 2026-01-02 02:28:05 -05:00
  • 5be1ed65fb Merge remote-tracking branch 'origin/develop' into vpietila/ckb-bwd-weight-factories Ville Pietilä 2026-01-02 02:15:24 -05:00
  • 6b6bc88064 Merge commit '6e8c401e33676ccc21992c849e73640a383d288c' into develop assistant-librarian[bot] 2026-01-01 00:43:10 +00:00
  • c04359ebf1 [CK_BUILDER] Instance traits for conv bwd weight algorithms (#3498) Ville Pietilä 2025-12-31 15:41:15 -08:00
  • ba9dbd433a [CK_BUILDER] Instance traits for conv bwd weight algorithms (#3498) Ville Pietilä 2025-12-31 15:41:15 -08:00
  • 6e8c401e33 [CK_BUILDER] Instance traits for conv bwd weight algorithms (#3498) Ville Pietilä 2025-12-31 15:41:15 -08:00
  • eef8d8bda7 Remove another patch users/dahawkin/revert-f86bbb1aefdd047b2b0e886dda831417e790f622 Daryl Hawkins 2025-12-31 17:31:14 -05:00
  • 2d8878c43d Remove rocm-libraries patch that was applied directly to rocm-libraries in https://github.com/ROCm/rocm-libraries/pull/3561 Daryl Hawkins 2025-12-31 17:14:28 -05:00
  • 3c6fcefbf0 Revert "[CK_Builder] [testing] Integrate device random generators (#3427)" Daryl Hawkins 2025-12-31 15:17:16 -05:00
  • 21391d4406 Merge commit 'f3e4d46faa5f3ce4d81c86121782d8a9aea27c5e' into develop assistant-librarian[bot] 2025-12-31 20:13:22 +00:00
  • 25df0256f4 Temporarily disable kernel instances that won't build on gfx1101 on Windows (#3499) DarylHawkinsAMD 2025-12-31 13:12:45 -07:00
  • 67b61ccf5c Temporarily disable kernel instances that won't build on gfx1101 on Windows (#3499) DarylHawkinsAMD 2025-12-31 13:12:45 -07:00
  • f3e4d46faa Temporarily disable kernel instances that won't build on gfx1101 on Windows (#3499) DarylHawkinsAMD 2025-12-31 13:12:45 -07:00
  • fba80401d1 Add factory for DeviceGroupedConvBwdWeightMultipleD_Xdl_CShuffle Ville Pietilä 2025-12-31 09:32:58 -05:00
  • 83be9c740c Add test for creating DeviceGroupedConvBwdWeightMultipleD_Xdl_CShuffle instance. Ville Pietilä 2025-12-31 09:08:06 -05:00
  • e1b4acd431 Final implementation for bwd weight DL factory. Ville Pietilä 2025-12-31 08:29:11 -05:00
  • 3b0777f629 Conv bwd weight DL factory. Ville Pietilä 2025-12-31 07:38:52 -05:00
  • ae31f7c0d7 seems working yadai/moe_a4w4 yadaish 2025-12-31 12:33:48 +00:00
  • 75710202ab Added factory for DeviceGroupedConvBwdWeightTwoStage_Xdl_CShuffle. Ville Pietilä 2025-12-31 04:32:28 -05:00
  • 14c7b15fc2 Merge commit 'f86bbb1aefdd047b2b0e886dda831417e790f622' into develop assistant-librarian[bot] 2025-12-30 18:15:59 +00:00
  • d6361ee5ed [CK_Builder] [testing] Integrate device random generators (#3427) kabrahamAMD 2025-12-30 19:03:05 +01:00
  • d7f7c1b6db [CK_Builder] [testing] Integrate device random generators (#3427) kabrahamAMD 2025-12-30 19:03:05 +01:00
  • f86bbb1aef [CK_Builder] [testing] Integrate device random generators (#3427) kabrahamAMD 2025-12-30 19:03:05 +01:00
  • ee8ec5af8d Merge commit '2b8302eb6d2217c0f537c28538265f4003ec416e' into develop assistant-librarian[bot] 2025-12-30 16:14:01 +00:00
  • 0ecba120e0 Fix grouped conv wrw kernels names (#3494) Bartłomiej Kocot 2025-12-30 16:45:39 +01:00
  • c5245882c3 Fix grouped conv wrw kernels names (#3494) Bartłomiej Kocot 2025-12-30 16:45:39 +01:00
  • 2b8302eb6d Fix grouped conv wrw kernels names (#3494) Bartłomiej Kocot 2025-12-30 16:45:39 +01:00
  • a71a7b2d83 Grouped convolution backward data WMMA v3 implementation (#3460) ApoorvaKalyani 2025-12-30 16:25:08 +01:00
  • bac1ccbf8b Grouped convolution backward data WMMA v3 implementation (#3460) ApoorvaKalyani 2025-12-30 16:25:08 +01:00
  • 53a1e4f551 Grouped convolution backward data WMMA v3 implementation (#3460) ApoorvaKalyani 2025-12-30 16:25:08 +01:00
  • 30c10e2544 Build new instance traits unit tests but exclude WMMA for now. Ville Pietilä 2025-12-30 05:52:55 -05:00
  • adfab9db7e Add unit tests for instance strings. Ville Pietilä 2025-12-30 05:52:15 -05:00
  • 3c1e2b0170 Add instance traits for bwd weight algorithms. Ville Pietilä 2025-12-30 04:29:38 -05:00
  • 6850e5e7bb Merge commit 'dae85ead64c16b34eaa643d09fb0d6da008ca814' into develop assistant-librarian[bot] 2025-12-29 15:14:37 +00:00
  • a57f8d8b67 [CK_TILE] support split-k a16w4 gemm1 (#3389) yadaish 2025-12-29 23:05:35 +08:00
  • fc3ffa0d75 [CK_TILE] support split-k a16w4 gemm1 (#3389) yadaish 2025-12-29 23:05:35 +08:00
  • dae85ead64 [CK_TILE] support split-k a16w4 gemm1 (#3389) yadaish 2025-12-29 23:05:35 +08:00
  • 3e16fa072f Test fix. Ville Pietilä 2025-12-29 09:53:47 -05:00
  • ab88cee0eb Add instance traits for DeviceGroupedConvBwdWeight_Xdl_CShuffleV3. Ville Pietilä 2025-12-29 09:53:07 -05:00
  • a83790e9da Build conv bwd weight v3 instances successfully. Ville Pietilä 2025-12-29 09:30:58 -05:00
  • a90c72d560 Merge commit 'a0acc83a72c84a8cdbbdef6f397e617ac040aa72' into develop assistant-librarian[bot] 2025-12-29 14:13:26 +00:00
  • 80f44824f5 Add bwd weight XDL CShuffle V3 factory. Ville Pietilä 2025-12-29 09:12:14 -05:00
  • 89e943a9f3 [CK_BUILDER] Add GPU Reference Algorithm to CK Builder (#3381) JH-Leon-KIM-AMD 2025-12-29 16:11:08 +02:00
  • 3772cf9dd4 [CK_BUILDER] Add GPU Reference Algorithm to CK Builder (#3381) JH-Leon-KIM-AMD 2025-12-29 16:11:08 +02:00
  • a0acc83a72 [CK_BUILDER] Add GPU Reference Algorithm to CK Builder (#3381) JH-Leon-KIM-AMD 2025-12-29 16:11:08 +02:00
  • 0b6dde06c3 Merge commit '88ae4455806efe2019bb0403606f7c4a1e3d9c3a' into develop assistant-librarian[bot] 2025-12-29 12:22:38 +00:00
  • 277981bc9b Clean-up CK Tile builder tests. Ville Pietilä 2025-12-29 07:15:08 -05:00
  • 9926d942e9 Separate bwd weight and bwd data tests into separate targets. Ville Pietilä 2025-12-29 07:03:55 -05:00
  • ac28f1b016 Replace grouped conv bwd wei wmmaV3 bilin/scale bf16f32bf16 support with bf16bf16bf16 (#3470) Kiefer van Teutem 2025-12-29 12:58:29 +01:00
  • 04d4dd1ada Replace grouped conv bwd wei wmmaV3 bilin/scale bf16f32bf16 support with bf16bf16bf16 (#3470) Kiefer van Teutem 2025-12-29 12:58:29 +01:00
  • 88ae445580 Replace grouped conv bwd wei wmmaV3 bilin/scale bf16f32bf16 support with bf16bf16bf16 (#3470) Kiefer van Teutem 2025-12-29 12:58:29 +01:00
  • 52086b350a Fix smoke tests. Ville Pietilä 2025-12-29 05:44:51 -05:00
  • 3bd0f05081 Fix fwd conv builder tests. Ville Pietilä 2025-12-29 05:31:35 -05:00
  • 027d943b2f Update conv specialization enum. Ville Pietilä 2025-12-29 05:06:39 -05:00
  • 30a9686877 Update compiletime diagnostics to use the size type. Ville Pietilä 2025-12-29 04:56:22 -05:00
  • 8c80e005bd Introduce a common size type for concepts. Ville Pietilä 2025-12-29 04:53:19 -05:00
  • ff2fdd8acc Improve concept diagnostics. Ville Pietilä 2025-12-29 04:26:06 -05:00