Commit Graph

  • d074af36c9 Implement grouped gemm fastgelu for RDNA4 (#3303) Erwin Terpstra 2026-01-07 19:20:44 +01:00
  • 2379b5e6e0 Implement grouped gemm fastgelu for RDNA4 (#3303) Erwin Terpstra 2026-01-07 19:20:44 +01:00
  • f9c6ba0403 Implement grouped gemm fastgelu for RDNA4 (#3303) Erwin Terpstra 2026-01-07 19:20:44 +01:00
  • 9af4498194 Remove the defaults for SrcDataType and DstDataType in GemmPipelineAgBgCrImplBase::GlobalPrefetch Sami Aario 2026-01-07 13:48:38 +00:00
  • 9633d3f5bb In GetAWindows and GetBWindows, use DataType from LDS tensor view Sami Aario 2025-12-17 14:41:31 +00:00
  • 9559a93432 Make explicit that the tile window argument to load_tile_with_elementwise and the two load methods it uses are tuples Sami Aario 2025-12-12 09:53:26 +00:00
  • cfa11f2d1f Rename InterleavedPKTypeLoader to ConverterLoader, and load_int4_tile to load_and_convert_tile Sami Aario 2025-11-27 08:35:18 +00:00
  • 3a094e2f8b Include ck_tile/core.hpp in load_interleaved_pk_type.hpp for better IDE integration Sami Aario 2025-11-26 11:33:27 +00:00
  • 74533b4755 Rename load_interleaved_pk_type to load_and_convert_tile Sami Aario 2025-11-27 09:28:27 +00:00
  • 994b8f4c22 Minor refactoring of load_interleaved_pk_type Sami Aario 2025-11-12 08:34:13 +00:00
  • ca71cd75fc Reduce the scope of KPack in MakeALdsBlockDescriptor Sami Aario 2025-12-17 14:52:24 +00:00
  • 825d17c3d7 Fix a comment Sami Aario 2025-12-11 12:32:52 +00:00
  • bda5a7aa2d Add braces Sami Aario 2025-11-19 09:08:27 +00:00
  • 969156985b Use decltype for consistency in Interwave variant of BlockGemmImpl Sami Aario 2025-11-21 10:53:14 +00:00
  • 4d77856be5 Make some functions return void explicitly instead of auto Sami Aario 2025-12-06 17:17:16 +00:00
  • 54e7d86ee2 Merge commit 'a7d6b1e7008c0b6e1af8a7d79389aefbdca4da65' into develop assistant-librarian[bot] 2026-01-07 16:16:37 +00:00
  • 6f6256381a Add unit test coverage for conversion to convolution traits (#3515) John Shumway 2026-01-07 07:44:21 -08:00
  • a89756823c Add unit test coverage for conversion to convolution traits (#3515) John Shumway 2026-01-07 07:44:21 -08:00
  • a7d6b1e700 Add unit test coverage for conversion to convolution traits (#3515) John Shumway 2026-01-07 07:44:21 -08:00
  • 2273f06ad6 [CI, CK examples] Disable time_kernel for CI tests and examples (#3464) Johannes Graner 2026-01-07 16:30:57 +01:00
  • acf98936bc [CI, CK examples] Disable time_kernel for CI tests and examples (#3464) Johannes Graner 2026-01-07 16:30:57 +01:00
  • 0a474aa62f [CI, CK examples] Disable time_kernel for CI tests and examples (#3464) Johannes Graner 2026-01-07 16:30:57 +01:00
  • 850997ff67 Merge commit 'e8cc75aefbe365750cf79c1188014325578941d8' into develop assistant-librarian[bot] 2026-01-07 15:15:08 +00:00
  • 83cc424e31 Enable offload-compress for Windows if avaliable (#3521) BrianHarrisonAMD 2026-01-07 08:05:03 -07:00
  • edc3e4a870 Enable offload-compress for Windows if avaliable (#3521) BrianHarrisonAMD 2026-01-07 08:05:03 -07:00
  • e8cc75aefb Enable offload-compress for Windows if avaliable (#3521) BrianHarrisonAMD 2026-01-07 08:05:03 -07:00
  • 5c0965184e estimate vgpr (branch: qlin/port_gfx11_build_refine) Qun Lin 2026-01-07 19:30:21 +08:00
  • 7b3aca7878 Merge remote-tracking branch 'origin/vpietila/ckb-bwd-weight-factories' into vpietila/ckb-refactor-warp-gemm-descriptors Ville Pietilä 2026-01-07 06:16:43 -05:00
  • d107b851c1 Merge branch 'develop' into vpietila/ckb-bwd-weight-factories Ville Pietilä 2026-01-07 02:43:24 -08:00
  • c5cdd51ce4 Fix factory for regular WMMA conv bwd weight. Ville Pietilä 2026-01-07 05:41:22 -05:00
  • 00f45cca2e clang-format Ville Pietilä 2026-01-07 03:58:09 -05:00
  • 82803221c3 Fix smoke tests. Ville Pietilä 2026-01-07 03:57:02 -05:00
  • 37e9547a29 Fix ref algorithm dispatching. Ville Pietilä 2026-01-07 03:56:50 -05:00
  • bb614ee8b2 Merge commit 'd7497d26948ca90d0224920472712e0f657fb744' into develop assistant-librarian[bot] 2026-01-07 08:16:44 +00:00
  • 026c9200ee [CK TILE] Refactor function amd_buffer_load_invalid_element_return_zero (#3512) Cong Ma 2026-01-07 01:05:56 -07:00
  • cdd9dafe6a [CK TILE] Refactor function amd_buffer_load_invalid_element_return_zero (#3512) Cong Ma 2026-01-07 01:05:56 -07:00
  • d7497d2694 [CK TILE] Refactor function amd_buffer_load_invalid_element_return_zero (#3512) Cong Ma 2026-01-07 01:05:56 -07:00
  • c406ae06aa Migrate to new instance-specific instance_to_conv_traits functions John Shumway 2026-01-06 23:05:46 -05:00
  • 73b1e59c75 Create a non-templated conv_traits struct John Shumway 2026-01-06 12:50:42 -05:00
  • 1260f443b4 Factor helpers out of conv_traits.hpp John Shumway 2026-01-06 12:31:06 -05:00
  • ee31554646 Add unit test coverage for conversion to convolution traits John Shumway 2026-01-05 22:02:51 -05:00
  • 9ec8eac079 Merge commit 'aaa35f0bbfa45dadc4380ddd6e0224668ddb97b4' into develop assistant-librarian[bot] 2026-01-06 21:12:56 +00:00
  • 945b165d47 [CK_Tile] Support for various group sizes Preshuffle quant for 2d block scale gemm (#3445) Khushbu Agarwal 2026-01-06 12:46:59 -08:00
  • c33704febc [CK_Tile] Support for various group sizes Preshuffle quant for 2d block scale gemm (#3445) Khushbu Agarwal 2026-01-06 12:46:59 -08:00
  • aaa35f0bbf [CK_Tile] Support for various group sizes Preshuffle quant for 2d block scale gemm (#3445) Khushbu Agarwal 2026-01-06 12:46:59 -08:00
  • 27de2f8fc8 [CKTILE] Support A/B Quantization in Blockscale Grouped Gemm (#3452) kyle-256 2026-01-07 04:36:04 +08:00
  • 9489e197c3 [CKTILE] Support A/B Quantization in Blockscale Grouped Gemm (#3452) kyle-256 2026-01-07 04:36:04 +08:00
  • 76696ace44 [CKTILE] Support A/B Quantization in Blockscale Grouped Gemm (#3452) kyle-256 2026-01-07 04:36:04 +08:00
  • c30f18927f [CK_TILE] add preshuffleB mode for ABQuant GEMM (#3495) kensclin 2026-01-07 04:35:01 +08:00
  • df198bd5af [CK_TILE] add preshuffleB mode for ABQuant GEMM (#3495) kensclin 2026-01-07 04:35:01 +08:00
  • 2309c86054 [CK_TILE] add preshuffleB mode for ABQuant GEMM (#3495) kensclin 2026-01-07 04:35:01 +08:00
  • 05b2660bf1 Merge commit '960ef551bf5d615d45e31b954e0faff147e76c85' into develop assistant-librarian[bot] 2026-01-06 19:12:05 +00:00
  • a2e2b2a59d Fix build error from extra comma (#3516) John Shumway 2026-01-06 11:08:54 -08:00
  • 946a6e7df0 Fix build error from extra comma (#3516) John Shumway 2026-01-06 11:08:54 -08:00
  • 960ef551bf Fix build error from extra comma (#3516) John Shumway 2026-01-06 11:08:54 -08:00
  • 38f334d882 Merge commit '2ffbf7f476d99b6fc3db71480b49d221c602e071' into develop assistant-librarian[bot] 2026-01-06 18:17:10 +00:00
  • 90a2126d1f add tabulate package to aiter docker (#3519) Illia Silin 2026-01-06 09:36:54 -08:00
  • acb2292b46 add tabulate package to aiter docker (#3519) Illia Silin 2026-01-06 09:36:54 -08:00
  • 2ffbf7f476 add tabulate package to aiter docker (#3519) Illia Silin 2026-01-06 09:36:54 -08:00
  • 5d0010c4b9 Merge commit '1c433c64ec5254d202b7cbf4b8b0e98678ea2a4f' into develop assistant-librarian[bot] 2026-01-06 09:16:30 +00:00
  • b3918fe248 [CK_BUILDER] Integrate reference conv with testing (#3511) Robin Voetter 2026-01-06 09:29:06 +01:00
  • ffc30531ac [CK_BUILDER] Integrate reference conv with testing (#3511) Robin Voetter 2026-01-06 09:29:06 +01:00
  • 1c433c64ec [CK_BUILDER] Integrate reference conv with testing (#3511) Robin Voetter 2026-01-06 09:29:06 +01:00
  • 2285a8345a Merge commit 'b78563b3d3edf1b2cd686ff0c0994ca2538419ef' into develop assistant-librarian[bot] 2026-01-06 08:16:41 +00:00
  • 00d05ab32e Merge some updates for ck_tile headers (#3342) joyeamd 2026-01-06 15:39:00 +08:00
  • e36567f015 Merge some updates for ck_tile headers (#3342) joyeamd 2026-01-06 15:39:00 +08:00
  • b78563b3d3 Merge some updates for ck_tile headers (#3342) joyeamd 2026-01-06 15:39:00 +08:00
  • 6fdd14c52d copyright boilerplate (branch: enable_persistent_async) Max Podkorytov 2026-01-05 18:49:29 -06:00
  • c9c551221f copyright boilerplate Max Podkorytov 2026-01-05 18:48:43 -06:00
  • 656675f0ec Merge remote-tracking branch 'origin/develop' into enable_persistent_async Max Podkorytov 2026-01-05 18:45:54 -06:00
  • 1783fc0ab9 copyright boilerplate Max Podkorytov 2026-01-05 18:43:22 -06:00
  • b8bbd3710c clang-format Max Podkorytov 2026-01-05 18:42:22 -06:00
  • 4c0afcd71e fix build Max Podkorytov 2026-01-05 18:21:35 -06:00
  • 3f746f7294 Merge commit '2b563ad04828c5c970f7544d49831f33203587fb' into develop assistant-librarian[bot] 2026-01-05 22:13:10 +00:00
  • 03c030203e fix file permissions Max Podkorytov 2026-01-05 16:07:58 -06:00
  • 3b5f2b2d99 Joye/revise wp pipeline (#3493) joyeamd 2026-01-06 05:49:26 +08:00
  • 9516169aaf Joye/revise wp pipeline (#3493) joyeamd 2026-01-06 05:49:26 +08:00
  • 2b563ad048 Joye/revise wp pipeline (#3493) joyeamd 2026-01-06 05:49:26 +08:00
  • 49a0466773 Merge branch 'develop' into enable_persistent_async Max Podkorytov 2026-01-05 12:50:17 -08:00
  • 3ee7a7765f Merge commit '1224bc0a82fbf47e1452bc4dbd63371471e57d4a' into develop assistant-librarian[bot] 2026-01-05 18:17:32 +00:00
  • 32e805b853 Add support to gfx1153 and fix gfx115X WMMA config (#3496) Estevan Vedovelli 2026-01-05 13:03:30 -05:00
  • 604ba0e9cf Add support to gfx1153 and fix gfx115X WMMA config (#3496) Estevan Vedovelli 2026-01-05 13:03:30 -05:00
  • 1224bc0a82 Add support to gfx1153 and fix gfx115X WMMA config (#3496) Estevan Vedovelli 2026-01-05 13:03:30 -05:00
  • e26a264f70 Fix large tensor grouped conv bwd data test (#3513) Bartłomiej Kocot 2026-01-05 18:42:02 +01:00
  • 502914e556 Fix large tensor grouped conv bwd data test (#3513) Bartłomiej Kocot 2026-01-05 18:42:02 +01:00
  • bbf0b1a3b3 Fix large tensor grouped conv bwd data test (#3513) Bartłomiej Kocot 2026-01-05 18:42:02 +01:00
  • d4321120f6 Add build trace visualization. John Shumway 2026-01-05 12:06:13 -05:00
  • f426c7ecc6 Add optimized kernel validation against GPU reference (branch: jeonghyun/ckb-add-optimized-kernel-validation) JH-Leon-KIM-AMD 2026-01-05 16:41:02 +00:00
  • 02243cabe6 Merge branch 'develop' into vpietila/ckb-bwd-weight-factories Ville Pietilä 2026-01-05 07:07:01 -08:00
  • 5f639559a1 WIP: Unify warp GEMM and thread distribution descriptions. Ville Pietilä 2026-01-05 09:52:46 -05:00
  • 7ef22db454 Merge commit 'e6e7dc29101bcd8a5d30ae99adf71a09fa544b09' into develop assistant-librarian[bot] 2026-01-05 13:26:54 +00:00
  • 98125fd6b4 [CK_BUILDER] validation (#3471) Robin Voetter 2026-01-05 13:57:34 +01:00
  • 14a149bab6 [CK_BUILDER] validation (#3471) Robin Voetter 2026-01-05 13:57:34 +01:00
  • e6e7dc2910 [CK_BUILDER] validation (#3471) Robin Voetter 2026-01-05 13:57:34 +01:00
  • 85642a59c2 Merge commit 'cc75a1dc5f18613af29d8821375f79b0f3c6410b' into develop assistant-librarian[bot] 2026-01-05 11:13:10 +00:00
  • fd84daec4c [FMHA] Batch Prefill Support Improvements: Change KV Cache Layout & Large Page Size Support (#3442) Jeff Huang 2026-01-05 18:41:47 +08:00
  • 4f3995a3e3 [FMHA] Batch Prefill Support Improvements: Change KV Cache Layout & Large Page Size Support (#3442) Jeff Huang 2026-01-05 18:41:47 +08:00
  • cc75a1dc5f [FMHA] Batch Prefill Support Improvements: Change KV Cache Layout & Large Page Size Support (#3442) Jeff Huang 2026-01-05 18:41:47 +08:00
  • 201039646e Move compile-time diagnostics to a separate branch. Ville Pietilä 2026-01-05 05:32:09 -05:00
  • 881bf916fe clang-format (branch: vpietila/ckb-improve-compile-time-errors) Ville Pietilä 2026-01-05 04:45:51 -05:00