Commit Graph

  • f5573f56d9 Add attention sink support for FMHA FWD (#3368) Linjun-AMD 2025-12-15 12:21:59 +08:00
  • ea731b5f29 Merge commit '22b945e06ea4b4de188d7ff4ec7ae4bf127be9f9' into develop assistant-librarian[bot] 2025-12-14 22:12:40 +00:00
  • eeb78c46a4 [CK_TILE] Stream-K Tree Reduction and Cache Skipping Integration (#3371) Emily Martins 2025-12-14 14:49:49 -07:00
  • 5ca871a397 [CK_TILE] Stream-K Tree Reduction and Cache Skipping Integration (#3371) Emily Martins 2025-12-14 14:49:49 -07:00
  • 22b945e06e [CK_TILE] Stream-K Tree Reduction and Cache Skipping Integration (#3371) Emily Martins 2025-12-14 14:49:49 -07:00
  • ca5fb0a3b7 Merge commit '9ac51aa0f44bae776609036f291c3cd2666e84ee' into develop assistant-librarian[bot] 2025-12-14 21:11:46 +00:00
  • a3270d2eb0 Add describe() method to device ops for runtime introspection (#3375) John Shumway 2025-12-14 12:49:12 -08:00
  • 1f97dc1aee Add describe() method to device ops for runtime introspection (#3375) John Shumway 2025-12-14 12:49:12 -08:00
  • 9ac51aa0f4 Add describe() method to device ops for runtime introspection (#3375) John Shumway 2025-12-14 12:49:12 -08:00
  • d0b4a2a403 Merge commit '21f06aa47ded64b9a07d81bf4b743c21462178db' into develop assistant-librarian[bot] 2025-12-14 19:11:55 +00:00
  • 5c81464568 CK Tile: Enable padding blockscale example (#3417) Enrico Degregori 2025-12-14 19:25:47 +01:00
  • 5275e93a46 CK Tile: Enable padding blockscale example (#3417) Enrico Degregori 2025-12-14 19:25:47 +01:00
  • 21f06aa47d CK Tile: Enable padding blockscale example (#3417) Enrico Degregori 2025-12-14 19:25:47 +01:00
  • 179f0e857e Rename WarpTile in fwd setting Qianfeng Zhang 2025-12-14 16:21:54 +00:00
  • 125934a966 Simplifying the codes in defining KDram and QDram tile distribution Qianfeng Zhang 2025-12-14 13:50:49 +00:00
  • e1694a9547 Fix splitk Enrico Degregori 2025-12-14 12:04:58 +00:00
  • 1ab5e9da93 Tiny update in GetMaxVectorSize() Qianfeng Zhang 2025-12-14 04:26:30 +00:00
  • 5346923492 Merge commit '6219b12730e29c357a02177dbee6e565987fcc56' into develop assistant-librarian[bot] 2025-12-13 15:11:36 +00:00
  • 417ed79412 [CK_BUILDER] convolution testing (#3267) Robin Voetter 2025-12-13 15:33:41 +01:00
  • ccbeebe9ea [CK_BUILDER] convolution testing (#3267) Robin Voetter 2025-12-13 15:33:41 +01:00
  • 6219b12730 [CK_BUILDER] convolution testing (#3267) Robin Voetter 2025-12-13 15:33:41 +01:00
  • 76cfa34242 Merge commit '9707ddb444f42b490c73b7884babccde2988ed7e' into develop assistant-librarian[bot] 2025-12-13 00:36:51 +00:00
  • d287385933 [CK TILE GEMM STREAMK] update identifier names according to the new code style (#3348) Cong Ma 2025-12-12 17:08:26 -07:00
  • ff2362712a [CK TILE GEMM STREAMK] update identifier names according to the new code style (#3348) Cong Ma 2025-12-12 17:08:26 -07:00
  • 9707ddb444 [CK TILE GEMM STREAMK] update identifier names according to the new code style (#3348) jshumway/tensor Cong Ma 2025-12-12 17:08:26 -07:00
  • 9ad0687b28 enabling prefill shapes khushbu 2025-12-12 16:45:40 -05:00
  • d0461ef299 Experiment with notebooks to analyze build times jshumway/exp-build-analysis John Shumway 2025-12-12 15:37:45 -05:00
  • fd68c6a534 Merge commit 'b4a34371a6a075fd00e22cf589f683de5f9271e3' into develop assistant-librarian[bot] 2025-12-12 19:12:28 +00:00
  • 7cbd8b75a0 Fix compilation ab scale multi target (#3413) Enrico Degregori 2025-12-12 19:26:47 +01:00
  • 57def6fa6c Fix compilation ab scale multi target (#3413) Enrico Degregori 2025-12-12 19:26:47 +01:00
  • b4a34371a6 Fix compilation ab scale multi target (#3413) Enrico Degregori 2025-12-12 19:26:47 +01:00
  • d418c72980 Merge commit 'fc7bf0ab1c5ed28e5962681007f84a2e8d3ee051' into develop assistant-librarian[bot] 2025-12-12 18:17:09 +00:00
  • 245c274287 [CK_TILE] Port hw independent changes from internal repo to develop branch (#3301) linqunAMD 2025-12-13 01:28:37 +08:00
  • c6ab08a491 [CK_TILE] Port hw independent changes from internal repo to develop branch (#3301) linqunAMD 2025-12-13 01:28:37 +08:00
  • fc7bf0ab1c [CK_TILE] Port hw independent changes from internal repo to develop branch (#3301) linqunAMD 2025-12-13 01:28:37 +08:00
  • f9bf419b01 disable test_tile_gemm_quant_bquant_preshuffle (#3420) Illia Silin 2025-12-12 09:27:12 -08:00
  • 1f2421c944 disable test_tile_gemm_quant_bquant_preshuffle (#3420) Illia Silin 2025-12-12 09:27:12 -08:00
  • 9869641324 disable test_tile_gemm_quant_bquant_preshuffle (#3420) Illia Silin 2025-12-12 09:27:12 -08:00
  • f79a29ac80 Rename and add scripts for testing hdim96 Qianfeng Zhang 2025-12-12 15:23:01 +00:00
  • b3d54477f1 Enable hdim96 instances Qianfeng Zhang 2025-12-12 14:54:11 +00:00
  • 3361cfd1bf Perf analysis scripts. Ville Pietilä 2025-12-12 10:33:32 -05:00
  • 43a1f818d6 fix compile error KenSCLin 2025-12-12 13:29:11 +00:00
  • f4419af2c5 Fix typo Enrico Degregori 2025-12-12 13:54:32 +00:00
  • d40fb754bf fix compile error KenSCLin 2025-12-12 13:29:11 +00:00
  • fa19112a68 Improve benchmarking script. Ville Pietilä 2025-12-12 08:09:35 -05:00
  • 5c9f869776 WIP samremes/quantize_in_ab_scale_gemm Sami Remes 2025-12-12 07:01:02 -05:00
  • e41b818b9f Fix gridwise gemm Enrico Degregori 2025-12-12 11:41:54 +00:00
  • 9d87cfec15 Fix gridwise common Enrico Degregori 2025-12-12 11:40:20 +00:00
  • 9c7f272a6b Fix compilation error Enrico Degregori 2025-12-12 11:40:00 +00:00
  • df75061576 Restore example tolerance calculation Enrico Degregori 2025-12-12 11:17:31 +00:00
  • a87256a676 Remove autodeduce 1 stage Enrico Degregori 2025-12-12 10:35:30 +00:00
  • d5cef00770 Merge branch 'develop' into ckTileEnginePooling Aleksander Dudek 2025-12-12 10:33:06 +00:00
  • 0f1bb0e817 Fix gridwise ab scale Enrico Degregori 2025-12-12 10:14:13 +00:00
  • 4a3c949753 Fix gridwise common Enrico Degregori 2025-12-12 10:11:42 +00:00
  • 29743bc0f4 Fix explicit conv bwd weight struct Enrico Degregori 2025-12-12 09:49:17 +00:00
  • 18108d0d54 Fix with regard to define stride in MakeKLdsBlockDescriptor() Qianfeng Zhang 2025-12-12 09:27:17 +00:00
  • 2ed14be8cd Merge branch 'develop' into tianxing/unified-attention Tianxing Wu 2025-12-12 09:49:40 +00:00
  • 0c67e9731a Address review comments Enrico Degregori 2025-12-12 09:49:01 +00:00
  • ca8989abf6 Fixes Tianxing Wu 2025-12-12 09:43:23 +00:00
  • 3ea94e540b Merge branch 'develop' into streamhpc/conv_bwd_weight_wmma Enrico Degregori 2025-12-12 08:42:36 +00:00
  • 5d69875f90 Use correct workspace stride Graner, Johannes 2025-12-12 03:38:46 -05:00
  • ffad9c3e8f Fix copyright Enrico Degregori 2025-12-12 08:40:44 +00:00
  • 030cf6e5a0 clang-format for a8w8_moe_blk_gemm1 splitk change oscar 2025-12-12 15:57:50 +08:00
  • eed3c5a11d Merge branch 'ck_moe_bs_splitk' into ck_moe_bs_splitk_pr oscar 2025-12-12 15:36:18 +08:00
  • 1530e9a9dc add a4w4 moe so/moe_a4w4 solin 2025-12-12 06:19:20 +00:00
  • 8ecb5dd922 Merge commit '8d7a4e0c73e1d2741fecea200f14bda1dcacc8f7' into develop assistant-librarian[bot] 2025-12-12 05:14:30 +00:00
  • b4d5a50216 Bump rocm-docs-core[api_reference] from 1.31.0 to 1.31.1 in /docs/sphinx (#3410) dependabot[bot] 2025-12-11 21:09:40 -08:00
  • 9b0b3bd4cd Bump rocm-docs-core[api_reference] from 1.31.0 to 1.31.1 in /docs/sphinx (#3410) dependabot[bot] 2025-12-11 21:09:40 -08:00
  • 8d7a4e0c73 Bump rocm-docs-core[api_reference] from 1.31.0 to 1.31.1 in /docs/sphinx (#3410) dependabot[bot] 2025-12-11 21:09:40 -08:00
  • 44aaaacbec formatting khushbu 2025-12-11 22:29:23 -05:00
  • 2730025a98 Enable mixed mx flatmm Ding, Yi 2025-12-11 07:55:34 +00:00
  • 3118a17dc0 fix a bank conflict Ding, Yi 2025-12-10 08:35:04 +00:00
  • dc963eb359 Refactor policy Ding, Yi 2025-12-10 06:14:21 +00:00
  • b964d752f0 Merge commit '4011dbfec31a711aaa4c1071c31bdc55f9b7974a' into develop assistant-librarian[bot] 2025-12-11 23:13:32 +00:00
  • 2ac57c22c1 [CK-Tile] fixup codegen for tile engine ops gemm multid and gemm preshuffle (#3383) Max Podkorytov 2025-12-11 14:23:43 -08:00
  • 1381fdecb8 [CK-Tile] fixup codegen for tile engine ops gemm multid and gemm preshuffle (#3383) Max Podkorytov 2025-12-11 14:23:43 -08:00
  • 4011dbfec3 [CK-Tile] fixup codegen for tile engine ops gemm multid and gemm preshuffle (#3383) jshumway/transform Max Podkorytov 2025-12-11 14:23:43 -08:00
  • 995d1a5cf6 resolving merge conflicts khushbu 2025-12-11 14:44:08 -05:00
  • f7eba31069 Merge commit 'ff194a427129beabd419904ee173c221bcc2a5e5' into develop assistant-librarian[bot] 2025-12-11 19:37:59 +00:00
  • 92cbe3c17d fixing the tile window khushbu 2025-12-11 14:34:57 -05:00
  • 5d5dbdfb0d build: Hot fix to reduce massive build time by just disabling the instances (#3408) Aviral Goel 2025-12-11 22:39:20 +04:00
  • de59e393f6 build: Hot fix to reduce massive build time by just disabling the instances (#3408) Aviral Goel 2025-12-11 22:39:20 +04:00
  • ff194a4271 build: Hot fix to reduce massive build time by just disabling the instances (#3408) Aviral Goel 2025-12-11 22:39:20 +04:00
  • 32faf7b8e3 chore: add copyright to pass the CI (#3407) Aviral Goel 2025-12-11 22:34:15 +04:00
  • 182677e314 chore: add copyright to pass the CI (#3407) Aviral Goel 2025-12-11 22:34:15 +04:00
  • 45c4ea510c chore: add copyright to pass the CI (#3407) Aviral Goel 2025-12-11 22:34:15 +04:00
  • 112b5ecf6b fix pre-commit error KenSCLin 2025-12-11 17:42:06 +00:00
  • 00aa34fd7c fix pre-commit error KenSCLin 2025-12-11 17:42:06 +00:00
  • f9ad462542 Merge commit '4dcc3e59c1c0195dae7ee9da9ab76d18a4cafe9f' into develop assistant-librarian[bot] 2025-12-11 17:17:01 +00:00
  • 2debb6ca08 Merge branch 'develop' into ck_tile/gemm_blockscale_abquant kensclin 2025-12-12 01:06:46 +08:00
  • c2bb4d261f Add unit tests for blockscale AB-Quantization KenSCLin 2025-12-11 16:47:09 +00:00
  • c62440f7ca WIP Sami Remes 2025-12-11 16:33:38 +00:00
  • f2a25da322 chore: update copyright header for misc files (#3402) Aviral Goel 2025-12-11 20:25:29 +04:00
  • cb629a747f chore: update copyright header for misc files (#3402) Aviral Goel 2025-12-11 20:25:29 +04:00
  • 4dcc3e59c1 chore: update copyright header for misc files (#3402) Aviral Goel 2025-12-11 20:25:29 +04:00
  • 907d070ad6 WIP Sami Remes 2025-12-11 16:21:37 +00:00
  • 0566c90f66 Merge branch 'develop' into streamhpc/conv_bwd_weight_wmma Enrico Degregori 2025-12-11 16:13:05 +00:00
  • f55ff25622 Fix compilation errors with latest clang22 version. (#3396) Illia Silin 2025-12-11 08:09:29 -08:00
  • f40258cb82 Fix compilation errors with latest clang22 version. (#3396) Illia Silin 2025-12-11 08:09:29 -08:00
  • b2925ee207 Fix compilation errors with latest clang22 version. (#3396) Illia Silin 2025-12-11 08:09:29 -08:00