Commit Graph

  • 9581558692 remove redundant 01_unified_attention and fix cmake Juuso Korhonen 2025-11-11 14:29:46 +00:00
  • 100dcc9ea2 add qkv scale all ltqin 2025-11-11 13:56:25 +00:00
  • 4556720057 fix cmakelist Juuso Korhonen 2025-11-11 13:00:28 +00:00
  • 51ec0c2ec1 merge with Cong's Changes khuagarw 2025-11-11 03:16:08 +00:00
  • e1bc48d5f3 correct result but using same scale ltqin 2025-11-11 00:50:23 +00:00
  • 1e2dac15a1 save tmp (tenpercent/gfx950_lds_experiments) Max Podkorytov 2025-11-10 13:11:13 -06:00
  • 0b000816a4 Merge commit '9f33b7cfd3df3fcfd540f7633b0abd7019935761' into develop assistant-librarian[bot] 2025-11-10 19:12:32 +00:00
  • b40859d461 fix input range (#3188) Thomas Ning 2025-11-10 11:08:41 -08:00
  • fdccd7a3b4 fix input range (#3188) Thomas Ning 2025-11-10 11:08:41 -08:00
  • 9f33b7cfd3 fix input range (#3188) Thomas Ning 2025-11-10 11:08:41 -08:00
  • ddb0078fec [ck] Enable missing op for gfx11 and gfx12 (#3187) linqunAMD 2025-11-11 02:58:20 +08:00
  • 89b798620c [ck] Enable missing op for gfx11 and gfx12 (#3187) linqunAMD 2025-11-11 02:58:20 +08:00
  • 7b6ba8d5c2 [ck] Enable missing op for gfx11 and gfx12 (#3187) linqunAMD 2025-11-11 02:58:20 +08:00
  • 93b4c77e06 [ck] correct memory size in grouped_gemm_multi_abd_xdl_fixed_nk_bias_bf16_i8 (#3168) linqunAMD 2025-11-11 02:58:08 +08:00
  • 27df389d70 [ck] correct memory size in grouped_gemm_multi_abd_xdl_fixed_nk_bias_bf16_i8 (#3168) linqunAMD 2025-11-11 02:58:08 +08:00
  • e593a14ae1 [ck] correct memory size in grouped_gemm_multi_abd_xdl_fixed_nk_bias_bf16_i8 (#3168) linqunAMD 2025-11-11 02:58:08 +08:00
  • 5f9d5566e5 [CK-Tile] Add gtests for compiler CI for faster testing (#3123) Manish Kumar 2025-11-11 00:12:23 +05:30
  • 045a8ca2ff [CK-Tile] Add gtests for compiler CI for faster testing (#3123) Manish Kumar 2025-11-11 00:12:23 +05:30
  • d5746dd120 [CK-Tile] Add gtests for compiler CI for faster testing (#3123) Manish Kumar 2025-11-11 00:12:23 +05:30
  • 9debcc1a55 [CK TILE GEMM] Refactor block_scale_gemm examples Cong Ma 2025-11-10 12:34:38 -05:00
  • 8f876f094e Simplify the codes in block_gemm_areg_bsmem_creg_v2_hack_1 Qianfeng Zhang 2025-11-10 15:52:12 +00:00
  • 303818a851 Simplify the codes in block_gemm_areg_bsmem_trload_creg Qianfeng Zhang 2025-11-10 15:27:34 +00:00
  • 6ba822688a test clang format Tianxing Wu 2025-11-10 13:52:13 +00:00
  • 51a4ae44ef Post-merge fixes. Make sure the new gridwise gemm wmma v3 common Run function can be used. Remove splitK, and forceThreadTileTransfer for now. Also add CShuffle epilogue argument. kiefer 2025-11-10 13:13:04 +00:00
  • a09bd71942 Update include/ck_tile/ops/gemm/kernel/gemm_tile_partitioner.hpp Tianxing Wu 2025-11-10 14:18:57 +02:00
  • 33e8a5761f Merge branch 'develop' into moe_xcd_remap Tianxing Wu 2025-11-10 14:16:51 +02:00
  • 47c9d0a131 More compilation fix Tianxing Wu 2025-11-10 11:34:52 +00:00
  • 5cc470b167 Merge remote-tracking branch 'origin/develop' into 65-grouped-conv-fwd-wmma kiefer 2025-11-10 08:55:07 +00:00
  • 8818018d0c [CK TILE GEMM] Refactor block_scale_gemm examples Cong Ma 2025-11-07 23:03:54 -05:00
  • c553c87861 enable prefill shapes khuagarw 2025-11-07 20:32:44 +00:00
  • 23a5127a41 update comprehensive tests Mohsen Saffari 2025-11-07 17:45:39 +00:00
  • bc26224ce1 [CK TILE GEMM] Refactor block_scale_gemm examples Cong Ma 2025-11-07 12:13:45 -05:00
  • 625ce4b77c implement script to run comprehensive combinations with flatmm_moe example to find the issues Mohsen Saffari 2025-11-07 16:52:14 +00:00
  • a967b4906b [CK TILE GEMM] Refactor block_scale_gemm examples Cong Ma 2025-11-06 19:31:58 -05:00
  • 5c938eacc3 temp yiltan-temp Bartlomiej Kocot 2025-11-07 08:22:03 -06:00
  • e9198da7c1 complish code(need debug) ltqin 2025-11-07 13:07:50 +00:00
  • 1d3304ab9e revert change on fmha Tianxing Wu 2025-11-07 12:31:31 +00:00
  • 8e7de8d5e3 Merge branch 'develop' into tianxing/unified-attention Tianxing Wu 2025-11-07 12:29:17 +00:00
  • d4a419c0c9 Introduce inheritance and specialization. (vpietila/ckb-remove-explicit-device-op-flag) Ville Pietilä 2025-11-07 12:13:37 +00:00
  • c959c1117a Clean-up recently added instances. Ville Pietilä 2025-11-07 11:02:41 +00:00
  • ea9d100d4a Merge branch 'vpietila/ckb-fwd-instance-test-improvements' into vpietila/ckb-remove-explicit-device-op-flag Ville Pietilä 2025-11-07 09:32:58 +00:00
  • 8028ff6d93 Add missing header to instance traits. (vpietila/ckb-fwd-instance-test-improvements) Ville Pietilä 2025-11-07 09:22:30 +00:00
  • 9fb1d16ae9 Merge remote-tracking branch 'origin/develop' into vpietila/ckb-fwd-instance-test-improvements Ville Pietilä 2025-11-07 09:16:39 +00:00
  • 650109a348 Merge commit 'e31a7a4f29b371c32ea9daf9211b6ae1fed2fa40' into develop assistant-librarian[bot] 2025-11-07 04:14:29 +00:00
  • 0344170dac fix MX bpreshuffle gemm B grid descriptor dimension error. (#3170) Gino Lu 2025-11-07 11:42:39 +08:00
  • 89a665e60e fix MX bpreshuffle gemm B grid descriptor dimension error. (#3170) Gino Lu 2025-11-07 11:42:39 +08:00
  • e31a7a4f29 fix MX bpreshuffle gemm B grid descriptor dimension error. (#3170) Gino Lu 2025-11-07 11:42:39 +08:00
  • 4c67bf8aaf Merge commit 'd04eba4ae37c8c2d40855f02aa861e1ac1ec7b3f' into develop assistant-librarian[bot] 2025-11-07 01:40:22 +00:00
  • e5d6a8091f formatting khuagarw 2025-11-07 01:27:59 +00:00
  • 6e40562dff Ck moe mxfp4 blockm32 (#3098) Xudong Yuan 2025-11-07 08:45:41 +08:00
  • a8dbac6470 Ck moe mxfp4 blockm32 (#3098) Xudong Yuan 2025-11-07 08:45:41 +08:00
  • d04eba4ae3 Ck moe mxfp4 blockm32 (#3098) Xudong Yuan 2025-11-07 08:45:41 +08:00
  • d1d568c17b Merge commit '5f3cae3e28a042e411afcd2e54b16cc6909c5bbb' into develop assistant-librarian[bot] 2025-11-07 00:36:11 +00:00
  • e8afef1e8b [CK_BUILDER]ckb add remining fwd conv device ops (#3155) JH-Leon-KIM-AMD 2025-11-07 02:29:48 +02:00
  • 4fbe5ee525 [CK_BUILDER]ckb add remining fwd conv device ops (#3155) JH-Leon-KIM-AMD 2025-11-07 02:29:48 +02:00
  • 5f3cae3e28 [CK_BUILDER]ckb add remining fwd conv device ops (#3155) JH-Leon-KIM-AMD 2025-11-07 02:29:48 +02:00
  • 63d8864858 Merge commit '76c4c12f5959adcd56d1627a1d1ce885deb9d096' into develop assistant-librarian[bot] 2025-11-06 23:12:25 +00:00
  • 085690955f Add .clangd and CMakeUserPresets.json to .gitignore (#3171) Johannes Graner 2025-11-07 00:07:39 +01:00
  • cd334376dc Add .clangd and CMakeUserPresets.json to .gitignore (#3171) Johannes Graner 2025-11-07 00:07:39 +01:00
  • 76c4c12f59 Add .clangd and CMakeUserPresets.json to .gitignore (#3171) Johannes Graner 2025-11-07 00:07:39 +01:00
  • cb20485d00 Merge commit '18e083003fa25a661015542c39b1979200f361cf' into develop assistant-librarian[bot] 2025-11-06 15:13:08 +00:00
  • 7f1592c0e2 Enable compilation of FP8 instances. Ville Pietilä 2025-11-06 15:00:36 +00:00
  • e99e5c35fd Merge remote-tracking branch 'origin/vpietila/ckb-remove-explicit-device-op-flag' into vpietila/ckb-fwd-bwd-instances Ville Pietilä 2025-11-06 08:53:54 -06:00
  • 64ebfd2345 Fix scale instances. Ville Pietilä 2025-11-06 14:50:02 +00:00
  • 9fde8e559a [CK_BUILDER] Convolution description (#3163) Adam Osewski 2025-11-06 15:46:26 +01:00
  • 3e184d3b67 [CK_BUILDER] Convolution description (#3163) Adam Osewski 2025-11-06 15:46:26 +01:00
  • 18e083003f [CK_BUILDER] Convolution description (#3163) Adam Osewski 2025-11-06 15:46:26 +01:00
  • 4c7c133721 Add test for building conv fwd FP8 instances. Ville Pietilä 2025-11-06 14:30:16 +00:00
  • 0aea348865 add group parameters for block quant ltqin 2025-11-06 12:32:36 +00:00
  • 59c7cafc60 Merge branch 'develop' into vpietila/ckb-fwd-instance-test-improvements Ville Pietilä 2025-11-06 13:48:06 +02:00
  • 78783a456c Merge commit '2234ff830b2f4ce8026c50b2d81f95f38f7117e5' into develop assistant-librarian[bot] 2025-11-06 11:12:13 +00:00
  • e89cb52306 [CK TILE] Convolution remove magic values (#3160) Bartłomiej Kocot 2025-11-06 11:26:30 +01:00
  • 5c219f1697 [CK TILE] Convolution remove magic values (#3160) Bartłomiej Kocot 2025-11-06 11:26:30 +01:00
  • 2234ff830b [CK TILE] Convolution remove magic values (#3160) Bartłomiej Kocot 2025-11-06 11:26:30 +01:00
  • 0c1c86a72e Fix linking for the shared lib. Ville Pietilä 2025-11-06 10:18:47 +00:00
  • d4d1399dac Disable three instances that cannot be built right now. Ville Pietilä 2025-11-06 10:18:32 +00:00
  • 009512536c Update code gen after pipeline version changes. Ville Pietilä 2025-11-06 08:40:38 +00:00
  • 7a63cfd6c4 Merge remote-tracking branch 'origin/vpietila/ckb-remove-explicit-device-op-flag' into vpietila/ckb-fwd-bwd-instances Ville Pietilä 2025-11-06 02:30:30 -06:00
  • beb1165d0d Merge branch 'vpietila/ckb-fwd-instance-test-improvements' into vpietila/ckb-remove-explicit-device-op-flag Ville Pietilä 2025-11-06 08:27:07 +00:00
  • 13b980c418 clang-format Ville Pietilä 2025-11-06 08:26:05 +00:00
  • bd0444f365 [Performance] Change the tile settings for mi350/trload no_softmax pipeline to enable to use mfma-16x16x32 for Gemm-1 Qianfeng Zhang 2025-11-06 08:20:11 +00:00
  • 9c64167d6f Merge remote-tracking branch 'origin/develop' into vpietila/ckb-fwd-instance-test-improvements Ville Pietilä 2025-11-06 08:19:57 +00:00
  • 9b341c5d6f add batch block scale parameters to kernel ltqin 2025-11-06 08:01:41 +00:00
  • cd3b8ae564 Merge commit '12922120d2567c3512048d7e8ed37e387a07bab6' into develop assistant-librarian[bot] 2025-11-06 07:13:12 +00:00
  • 846b43f43b add gfx11's barrier following SPG's reference (#3159) joyeamd 2025-11-06 14:29:03 +08:00
  • ee21c7b651 add gfx11's barrier following SPG's reference (#3159) joyeamd 2025-11-06 14:29:03 +08:00
  • 12922120d2 add gfx11's barrier following SPG's reference (#3159) joyeamd 2025-11-06 14:29:03 +08:00
  • 8c8f4b47ec run clang-format (shahamed/ck3129) Kevin Abraham 2025-10-30 21:17:20 +00:00
  • 4aeb73c616 fixed synchronization issue in block gemm pipeline v1 that caused b_scale to fail Kevin Abraham 2025-10-30 21:16:02 +00:00
  • 00e61e0b01 formatting khuagarw 2025-11-06 01:53:07 +00:00
  • d9272218c4 Merge pull request #3076 from ROCm/congma/release/rocm-rel-7.1/revert-2610 (rocm-7.1.1, release/rocm-rel-7.1.1.1, release/rocm-rel-7.1) JeniferC99 2025-11-05 17:17:35 -08:00
  • b3950e9d11 Merge commit '4533aa6dbab648adc1a496b6064cb79777c41cf5' into develop assistant-librarian[bot] 2025-11-06 00:35:42 +00:00
  • 8999dae5a0 formatting khuagarw 2025-11-05 23:58:21 +00:00
  • b7d6555a88 Fix compilation errors with clang22. (#3164) Illia Silin 2025-11-05 15:42:22 -08:00
  • d258a23f20 Fix compilation errors with clang22. (#3164) Illia Silin 2025-11-05 15:42:22 -08:00
  • 4533aa6dba Fix compilation errors with clang22. (#3164) Illia Silin 2025-11-05 15:42:22 -08:00
  • 4bbbfeb186 Merge commit 'b8527a92360496666ed6606e53ddc97e35dcf76e' into develop assistant-librarian[bot] 2025-11-05 17:12:47 +00:00
  • 54409e7fb5 [CK_BUILDER] Convolution traits. (#3152) Adam Osewski 2025-11-05 17:53:06 +01:00
  • f7bfb69702 [CK_BUILDER] Convolution traits. (#3152) Adam Osewski 2025-11-05 17:53:06 +01:00
  • b8527a9236 [CK_BUILDER] Convolution traits. (#3152) Adam Osewski 2025-11-05 17:53:06 +01:00