Commit Graph

  • b940a75328 Comments Tianxing Wu 2025-10-14 12:19:20 +00:00
  • 4c0b5201eb Merge commit '589e242eda730958b36c4f78bfad1991c499b0d2' into develop assistant-librarian[bot] 2025-10-14 12:17:41 +00:00
  • ec29289bb1 kv paging Tianxing Wu 2025-10-14 12:04:11 +00:00
  • 6b4d770179 Fix: Handle JSON boolean values (pad_m, pad_n, pad_k and persistent) in gemm_instance_builder (#3008) msaffari-amd 2025-10-14 13:20:25 +02:00
  • 2ad7ba80c9 Fix: Handle JSON boolean values (pad_m, pad_n, pad_k and persistent) in gemm_instance_builder (#3008) msaffari-amd 2025-10-14 13:20:25 +02:00
  • 589e242eda Fix: Handle JSON boolean values (pad_m, pad_n, pad_k and persistent) in gemm_instance_builder (#3008) msaffari-amd 2025-10-14 13:20:25 +02:00
  • 290e131533 Improve build infrastructure for generating doc philipm/documentation-cleanup Philip Maybank 2025-10-14 12:03:51 +01:00
  • 22362f2599 enable 2d for reference quant gemm Sami Remes 2025-10-14 10:01:46 +00:00
  • c87f2e3ca9 o window change Tianxing Wu 2025-10-14 09:59:47 +00:00
  • 96b208f6c7 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Tianxing Wu 2025-10-14 09:58:30 +00:00
  • e1120fffb0 pipeline api Tianxing Wu 2025-10-14 09:58:27 +00:00
  • c3d27abfb8 fix q window Juuso Korhonen 2025-10-14 09:49:54 +00:00
  • b37c356090 fix q window origin Juuso Korhonen 2025-10-14 09:36:28 +00:00
  • 450d26dbe9 work on single instance generator pmaybank/tile_engine_gemm Philip Maybank 2025-10-14 09:28:36 +01:00
  • 48fdcd05a5 fix valarLip 2025-10-14 07:26:25 +00:00
  • 2072e53d1e Remove K0 from tile setting since it is not used Qianfeng Zhang 2025-10-13 16:01:50 +00:00
  • 0f6bf78caa Add empty instance factory. Ville Pietilä 2025-10-14 07:13:20 +00:00
  • 0b8f009173 load int4 tile valarLip 2025-10-14 06:45:59 +00:00
  • eaf9ba4e45 Rename CK Tile grouped conv factory. Ville Pietilä 2025-10-14 06:31:34 +00:00
  • c1a59349dc rm support of gfx11 and gfx12 valarLip 2025-10-14 02:41:02 +00:00
  • 63d907604b Merge commit 'e1b0bdfbfa92f47006fdbced627c7470eacdea2b' into develop assistant-librarian[bot] 2025-10-13 19:10:56 +00:00
  • 7907a466de [CK_TILE] Correct BlockWarps calculation and fix smoke-test in rmsnorm (#2540) ClementLinCF 2025-10-14 02:52:37 +08:00
  • 6a423df526 [CK_TILE] Correct BlockWarps calculation and fix smoke-test in rmsnorm (#2540) ClementLinCF 2025-10-14 02:52:37 +08:00
  • e1b0bdfbfa [CK_TILE] Correct BlockWarps calculation and fix smoke-test in rmsnorm (#2540) ClementLinCF 2025-10-14 02:52:37 +08:00
  • c80df7a092 Fix tile window API description Vidyasagar 2025-10-13 11:17:50 -07:00
  • bc873c1074 Updates based on feedback Vidyasagar 2025-10-13 10:49:01 -07:00
  • fc6a9e3931 Create invoker for the kernel and a factory for creating invokers. Ville Pietilä 2025-10-13 15:22:50 +00:00
  • 713691609c Merge commit 'fc2a121c4446b4ca939e977563528019b30e6114' into develop assistant-librarian[bot] 2025-10-13 15:12:25 +00:00
  • d4601123d2 Enable GMock and improve gtest configuration (#2976) John Shumway 2025-10-13 08:11:51 -07:00
  • 784f68e831 Enable GMock and improve gtest configuration (#2976) John Shumway 2025-10-13 08:11:51 -07:00
  • fc2a121c44 Enable GMock and improve gtest configuration (#2976) John Shumway 2025-10-13 08:11:51 -07:00
  • f6b07dcf79 start setting of group size for N dimension Sami Remes 2025-10-13 15:08:10 +00:00
  • 98365f5aa6 add some asserts for configurations not implemented Sami Remes 2025-10-13 14:32:32 +00:00
  • a60dab521e Added a placeholder conv bwd instance factory for CK Tile profiler. Ville Pietilä 2025-10-13 14:32:20 +00:00
  • 8bb5255526 Refactor quant group size to be configurable for M/N/K, not just K Sami Remes 2025-10-13 14:05:39 +00:00
  • 1fdfe40874 Merge commit 'd2bbca3eca2bd14014e3daae39ae70846ec8218b' into develop assistant-librarian[bot] 2025-10-13 13:20:32 +00:00
  • 6dcee56fee WIP: CK Tile conv bwd profiler. Ville Pietilä 2025-10-13 13:03:21 +00:00
  • 6a7fa959b7 kv tensor view and initial window Tianxing Wu 2025-10-13 12:53:43 +00:00
  • 4426784f38 [CK_TILE] Non-K Major from old CK to CK-Tile (#2442) Sami Remes 2025-10-13 13:27:02 +01:00
  • 985b9f98a2 [CK_TILE] Non-K Major from old CK to CK-Tile (#2442) Sami Remes 2025-10-13 13:27:02 +01:00
  • d2bbca3eca [CK_TILE] Non-K Major from old CK to CK-Tile (#2442) Sami Remes 2025-10-13 13:27:02 +01:00
  • 9bbf7016b6 Merge commit '634634f5c09a3b42f5f838a5af9c948602e246db' into develop assistant-librarian[bot] 2025-10-13 12:17:18 +00:00
  • bf0a5cbb11 [CK_TILE] Blockwise GEMM pipeline v6 - port of v5 from old CK (#2955) aledudek 2025-10-13 13:57:37 +02:00
  • ab7f67488c [CK_TILE] Blockwise GEMM pipeline v6 - port of v5 from old CK (#2955) aledudek 2025-10-13 13:57:37 +02:00
  • 634634f5c0 [CK_TILE] Blockwise GEMM pipeline v6 - port of v5 from old CK (#2955) aledudek 2025-10-13 13:57:37 +02:00
  • f1c8acbd71 [CK_TILE] Batched Gemm Kernel IsSupported function checks (#2860) aledudek 2025-10-13 13:55:23 +02:00
  • e42f27e42e [CK_TILE] Batched Gemm Kernel IsSupported function checks (#2860) aledudek 2025-10-13 13:55:23 +02:00
  • 3021604213 [CK_TILE] Batched Gemm Kernel IsSupported function checks (#2860) aledudek 2025-10-13 13:55:23 +02:00
  • d62f34348a Skeleton for the ckTileProfiler. Ville Pietilä 2025-10-13 11:40:31 +00:00
  • cd354286c1 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Tianxing Wu 2025-10-13 11:32:30 +00:00
  • be58d51d36 o ptr and window Tianxing Wu 2025-10-13 11:32:28 +00:00
  • cca873a770 Update include path to break the remod's cyclic dep issue (#2978) damien-lejeune 2025-10-13 13:24:47 +02:00
  • b904c41e44 Update include path to break the remod's cyclic dep issue (#2978) damien-lejeune 2025-10-13 13:24:47 +02:00
  • 46c10c316d Update include path to break the remod's cyclic dep issue (#2978) damien-lejeune 2025-10-13 13:24:47 +02:00
  • 19ea53596e Merge commit 'e9f0cc83a8f3f94ad8462e50a9d9a92d8dca3388' into develop assistant-librarian[bot] 2025-10-13 11:11:52 +00:00
  • 6ba25b7e84 add commenting Juuso Korhonen 2025-10-13 10:34:55 +00:00
  • bcc9d9e514 [CK Tile] contraction multi d - kernel & example (#2901) msaffari-amd 2025-10-13 12:30:28 +02:00
  • b9f7381f95 [CK Tile] contraction multi d - kernel & example (#2901) msaffari-amd 2025-10-13 12:30:28 +02:00
  • e9f0cc83a8 [CK Tile] contraction multi d - kernel & example (#2901) msaffari-amd 2025-10-13 12:30:28 +02:00
  • 81a02ffb40 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Juuso Korhonen 2025-10-13 10:30:22 +00:00
  • b721f79f99 fix Juuso Korhonen 2025-10-13 10:30:11 +00:00
  • 16129a794a stride fix Tianxing Wu 2025-10-13 10:30:08 +00:00
  • 96fde33ec4 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Tianxing Wu 2025-10-13 10:29:07 +00:00
  • 55fc6d7151 kv tensor view Tianxing Wu 2025-10-13 10:28:02 +00:00
  • af94aaf1cb refactor the q tensor view transformation Juuso Korhonen 2025-10-13 10:22:52 +00:00
  • 49ce980c67 Merge branch 'tianxing/unified-attention' of https://github.com/ROCm/composable_kernel into tianxing/unified-attention Juuso Korhonen 2025-10-13 10:21:27 +00:00
  • 2d6dab29eb refactor the q tensor view transformation Juuso Korhonen 2025-10-13 10:18:23 +00:00
  • 36a65b1968 refactor Tianxing Wu 2025-10-13 10:05:23 +00:00
  • 94569f3991 Build only grouped conv profilers. Ville Pietilä 2025-10-13 10:01:42 +00:00
  • bc6385f389 Some refactor Tianxing Wu 2025-10-13 10:01:38 +00:00
  • c984225db0 rm useless code lalala-sh 2025-10-13 09:12:17 +00:00
  • c9d1a9a025 fix typo lalala-sh 2025-10-13 08:10:58 +00:00
  • afc2405d98 try to fix format lalala-sh 2025-10-13 07:20:48 +00:00
  • 23394088f7 Merge commit '95bdc7410c99096652618759ff2ef3586951a0d0' into develop assistant-librarian[bot] 2025-10-13 07:13:38 +00:00
  • bbbea030c2 Add aiter pytest add-aiter-pytest Ding, Yi 2025-10-09 06:52:58 +00:00
  • 2d6547b18c [CK_TILE] FMHA BWD Add Instance for D48 on GFX950 (#2866) Yi DING 2025-10-13 15:03:46 +08:00
  • 2ff24ef58c [CK_TILE] FMHA BWD Add Instance for D48 on GFX950 (#2866) Yi DING 2025-10-13 15:03:46 +08:00
  • 95bdc7410c [CK_TILE] FMHA BWD Add Instance for D48 on GFX950 (#2866) Yi DING 2025-10-13 15:03:46 +08:00
  • 47ce811b3d code clean lalala-sh 2025-10-13 06:41:53 +00:00
  • 848782f41b fix example lalala-sh 2025-10-13 06:35:02 +00:00
  • 739a683cc9 Merge branch 'develop' into wjx/preshuffle_format wjx/preshuffle_format lalala-sh 2025-10-13 11:15:12 +08:00
  • 76201e9af5 enable atomic_add_bf16 in gfx950 wjx/atomic_add_bf16 lalala-sh 2025-10-13 02:52:46 +00:00
  • 8cc3310f33 code clean lalala-sh 2025-10-13 02:44:13 +00:00
  • 79c0f54a5d port ck_tile to cm9 feiw/dev/ckt_cm9 feifei14119 2025-09-01 15:43:43 +08:00
  • 22a7b31865 Change to pipeline so that it is easier to add support of using softmax Qianfeng Zhang 2025-10-11 10:11:35 +00:00
  • d308b09fae Remove using IGLP method for instruction scheduling for kUseLocal true path Qianfeng Zhang 2025-10-11 06:38:32 +00:00
  • ef06eef341 Merge commit 'f5708882a3c0f391b7d02f5af926964170bd8f4e' into develop assistant-librarian[bot] 2025-10-11 13:14:03 +00:00
  • 31f0642364 Streamk functional tests (#2974) Christopher Millette 2025-10-11 07:53:40 -05:00
  • 7df03bedec Streamk functional tests (#2974) Christopher Millette 2025-10-11 07:53:40 -05:00
  • f5708882a3 Streamk functional tests (#2974) Christopher Millette 2025-10-11 07:53:40 -05:00
  • ccea5c423a Merge commit '0843815db7763cf5650f7803185a3ab9d24194d7' into develop assistant-librarian[bot] 2025-10-11 02:35:26 +00:00
  • b1acf5cbf5 Fix GCC 7 CTAD compilation error in test_fmha_bwd.cpp (#3001) John Shumway 2025-10-10 19:13:34 -07:00
  • 7a3d9b8a40 Fix GCC 7 CTAD compilation error in test_fmha_bwd.cpp (#3001) John Shumway 2025-10-10 19:13:34 -07:00
  • 0843815db7 Fix GCC 7 CTAD compilation error in test_fmha_bwd.cpp (#3001) John Shumway 2025-10-10 19:13:34 -07:00
  • 702281d223 Merge commit '3c39d279ab4569d1b33399e7746465744ed662c0' into develop assistant-librarian[bot] 2025-10-10 23:11:29 +00:00
  • 99902d395c supporting prefill shapes for preshuffle block scale gemm (#2975) Khushbu Agarwal 2025-10-10 15:36:24 -07:00
  • fdb397b2c9 supporting prefill shapes for preshuffle block scale gemm (#2975) Khushbu Agarwal 2025-10-10 15:36:24 -07:00
  • 3c39d279ab supporting prefill shapes for preshuffle block scale gemm (#2975) Khushbu Agarwal 2025-10-10 15:36:24 -07:00
  • 07d14c9618 Merge commit '9d060d3e3c7c943a6609a95e11ff48c35b30edef' into develop assistant-librarian[bot] 2025-10-10 20:21:35 +00:00
  • 5e588dba5c [CK-Tile] functional support for transposed inputs in compute-bound double-lds-buffer pipeline with async loads from global memory to LDS (#2984) Max Podkorytov 2025-10-10 12:57:50 -07:00