Commit Graph

  • 54f903def5 fix enforcing fixedvectorsizes for ck tile conv (#3344) jakpiase 2025-12-05 09:30:22 +01:00
  • 5147792114 fix enforcing fixedvectorsizes for ck tile conv (#3344) jakpiase 2025-12-05 09:30:22 +01:00
  • f7650ee82b fix enforcing fixedvectorsizes for ck tile conv (#3344) jakpiase 2025-12-05 09:30:22 +01:00
  • 3686d9dc74 Quantization Tianxing Wu 2025-12-05 07:27:27 +00:00
  • a1ed60c087 update zanzhang 2025-12-05 13:40:11 +08:00
  • 6d4949bfd8 Resolve PR comments + add grouped gemm example Kumar 2025-12-05 11:00:10 +05:30
  • 65c7640fd0 update zanzhang 2025-12-05 12:00:51 +08:00
  • eeadb34e8f Merge commit '13f6d635653bd5ffbfcac8577f1ef09590c23d78' into develop assistant-librarian[bot] 2025-12-05 03:38:26 +00:00
  • 62e5b29702 Clean up conv_traits.hpp (#3354) John Shumway 2025-12-04 19:12:36 -08:00
  • b1dc9e64f6 Clean up conv_traits.hpp (#3354) John Shumway 2025-12-04 19:12:36 -08:00
  • 13f6d63565 Clean up conv_traits.hpp (#3354) John Shumway 2025-12-04 19:12:36 -08:00
  • 3b00e4022d Add jenga test and pre-commit Jiangyon 2025-12-05 03:07:11 +00:00
  • 5da2114921 Merge commit '05292b3604e143e98ec2cb67edb2e3d2ad1d6ecb' into develop assistant-librarian[bot] 2025-12-05 02:45:20 +00:00
  • d96f632fa1 [CK_TILE][FMHA] Integrate FAv2 & FAv3 (WIP) in the single fmha_fwd() API (#3153) Po Yen Chen 2025-12-05 10:31:12 +08:00
  • 5737132878 [CK_TILE][FMHA] Integrate FAv2 & FAv3 (WIP) in the single fmha_fwd() API (#3153) Po Yen Chen 2025-12-05 10:31:12 +08:00
  • 05292b3604 [CK_TILE][FMHA] Integrate FAv2 & FAv3 (WIP) in the single fmha_fwd() API (#3153) Po Yen Chen 2025-12-05 10:31:12 +08:00
  • 96ff482d8d fix hipblaslt build for different archs (#3358) Illia Silin 2025-12-04 18:29:14 -08:00
  • 88a222f851 fix hipblaslt build for different archs (#3358) Illia Silin 2025-12-04 18:29:14 -08:00
  • d1193e8637 fix hipblaslt build for different archs (#3358) Illia Silin 2025-12-04 18:29:14 -08:00
  • 48744f2d0d initial commit with prints khuagarw 2025-12-05 01:45:44 +00:00
  • 3021c7af1e adding test cases khuagarw 2025-12-04 21:59:42 +00:00
  • 19b78e9e4b Merge remote-tracking branch 'origin/develop' into 2dQuantPreshuffleWeight khuagarw 2025-12-04 20:16:14 +00:00
  • a9c43a3678 Merge commit 'd184eed823ca50dcafc57c66228f12300c0c9ccc' into develop assistant-librarian[bot] 2025-12-04 20:13:50 +00:00
  • 1f7aa130eb [CK-Tile] Refactor base pipeline usage (#3251) Max Podkorytov 2025-12-04 11:45:49 -08:00
  • e8e9f89bbe [CK-Tile] Refactor base pipeline usage (#3251) Max Podkorytov 2025-12-04 11:45:49 -08:00
  • d184eed823 [CK-Tile] Refactor base pipeline usage (#3251) Max Podkorytov 2025-12-04 11:45:49 -08:00
  • 4b6531908e Merge commit 'd9d4c9c3dfe38fe54bae5b3b1b9b523b011992dd' into develop assistant-librarian[bot] 2025-12-04 19:25:29 +00:00
  • c25f2909d0 [composable_kernel] initial draft of the ck tile conceptual doc (#3242) spolifroni-amd 2025-12-04 14:09:21 -05:00
  • 7afa7d9e43 [composable_kernel] initial draft of the ck tile conceptual doc (#3242) spolifroni-amd 2025-12-04 14:09:21 -05:00
  • d9d4c9c3df [composable_kernel] initial draft of the ck tile conceptual doc (#3242) spolifroni-amd 2025-12-04 14:09:21 -05:00
  • 12d764e999 update yadaish 2025-12-04 18:47:53 +00:00
  • e37dbc7e49 update yadaish 2025-12-04 18:35:00 +00:00
  • ef037eb259 update yadaish 2025-12-04 18:30:09 +00:00
  • bd6897d432 update yadaish 2025-12-04 18:28:17 +00:00
  • 8d4f766f9c Add scale and bias to batch normalization Mohsen Saffari 2025-12-04 16:22:32 +00:00
  • 3becd86717 Merge commit 'cd21e20ae7d4d3a6309ce238bb94814e145585d6' into develop assistant-librarian[bot] 2025-12-04 15:14:37 +00:00
  • 38076077ab build latest hipblaslt in ck_pytorch docker (#3347) Illia Silin 2025-12-04 06:58:42 -08:00
  • 646dc6133d build latest hipblaslt in ck_pytorch docker (#3347) Illia Silin 2025-12-04 06:58:42 -08:00
  • cd21e20ae7 build latest hipblaslt in ck_pytorch docker (#3347) Illia Silin 2025-12-04 06:58:42 -08:00
  • da2daf25dd Merge commit '9cb1f421bce29cb70bf7905858d2f8823f586621' into develop assistant-librarian[bot] 2025-12-04 11:12:52 +00:00
  • 61bfc42b73 Add placeholder for conv algorithm design description. Add link to conv factory description. Ville Pietilä 2025-12-04 11:12:20 +00:00
  • fe4ec78c6c Remove stale documentation. Ville Pietilä 2025-12-04 11:00:46 +00:00
  • 419aa4e420 [CK_BUILDER] Refactor convolution signature to provide data type/layout/elementwise op per tensor (#3331) Ville Pietilä 2025-12-04 12:58:31 +02:00
  • 7edfcf1b58 [CK_BUILDER] Refactor convolution signature to provide data type/layout/elementwise op per tensor (#3331) Ville Pietilä 2025-12-04 12:58:31 +02:00
  • 9cb1f421bc [CK_BUILDER] Refactor convolution signature to provide data type/layout/elementwise op per tensor (#3331) Ville Pietilä 2025-12-04 12:58:31 +02:00
  • 7059613404 update yadaish 2025-12-04 10:38:45 +00:00
  • 844702d69b Merge branch 'develop' of github.com:ROCm/composable_kernel into barkocot/lwpck-4085 barkocot/lwpck-4085 Bartlomiej Kocot 2025-12-04 09:55:43 +00:00
  • f032bc3f9a [BUILDER] Ck Tile Grouped convolution factory Bartlomiej Kocot 2025-12-04 09:38:17 +00:00
  • a3bbd74c0c Merge commit '583fafc803a0ec9d0edc902fc6b9ecfdc42fb09b' into develop assistant-librarian[bot] 2025-12-04 07:13:58 +00:00
  • a8f5d21fb8 [CK_TILE] Fix for Moving DataTypeTraits into a Common File (#3335) arai713 2025-12-03 22:46:22 -08:00
  • beaa1aa47c [CK_TILE] Fix for Moving DataTypeTraits into a Common File (#3335) arai713 2025-12-03 22:46:22 -08:00
  • 583fafc803 [CK_TILE] Fix for Moving DataTypeTraits into a Common File (#3335) arai713 2025-12-03 22:46:22 -08:00
  • 2171fa6588 Merge commit 'ffc3120f63135cc697e46031523e44c5cd5d43fa' into develop assistant-librarian[bot] 2025-12-04 06:16:54 +00:00
  • 18fe146dd5 Merge branch 'develop' into 2dQuantPreshuffleWeight Thomas Ning 2025-12-03 22:07:57 -08:00
  • a6f43cf9de Ck tile/gemm blockscale opt (#3227) kensclin 2025-12-04 14:07:23 +08:00
  • 9e8836195c Ck tile/gemm blockscale opt (#3227) kensclin 2025-12-04 14:07:23 +08:00
  • ffc3120f63 Ck tile/gemm blockscale opt (#3227) kensclin 2025-12-04 14:07:23 +08:00
  • 170089beb6 Merge commit 'eb7f6177136173c8a6af539bffd915fddff293c4' into develop assistant-librarian[bot] 2025-12-04 04:24:28 +00:00
  • 228b1e8d87 fp8 fmha async pipeline (#3339) rocking 2025-12-04 12:18:25 +08:00
  • ea2e816aa5 fp8 fmha async pipeline (#3339) rocking 2025-12-04 12:18:25 +08:00
  • eb7f617713 fp8 fmha async pipeline (#3339) rocking 2025-12-04 12:18:25 +08:00
  • 244126053c Merge branch 'develop' into 1dQuantPreshuffleWeight amd-khushbu 2025-12-04 01:49:42 +00:00
  • cc01d72e08 Support A/B Quantization in Blockscale GEMM KenSCLin 2025-12-04 01:24:06 +00:00
  • a3729978ce Support A/B Quantization in Blockscale GEMM KenSCLin 2025-12-04 01:24:06 +00:00
  • 8ff98b8095 fix the pre-commit Jiangyon 2025-12-04 01:09:53 +00:00
  • 1dc15fbb15 Merge branch 'develop' into sparse_attention_VSA jiangyon.ren 2025-12-04 08:51:20 +08:00
  • d0d2528beb Merge commit '4baa4c9fae0e56f1105d73a5d2484611d40886e0' into develop assistant-librarian[bot] 2025-12-03 20:13:41 +00:00
  • 250deafb9e [CK, CK_TILE] Add GPU Reference Implementations for Grouped Convolution (#3216) JH-Leon-KIM-AMD 2025-12-03 21:14:21 +02:00
  • 8a6ce28f47 [CK, CK_TILE] Add GPU Reference Implementations for Grouped Convolution (#3216) JH-Leon-KIM-AMD 2025-12-03 21:14:21 +02:00
  • 4baa4c9fae [CK, CK_TILE] Add GPU Reference Implementations for Grouped Convolution (#3216) JH-Leon-KIM-AMD 2025-12-03 21:14:21 +02:00
  • 44b6d9492d Removing repeated code from derived class tileengine-restructure ThruptiRajLakshmanaGowda 2025-12-03 17:25:39 +00:00
  • 416802ca15 Moving code snippet to base class ThruptiRajLakshmanaGowda 2025-12-03 17:23:41 +00:00
  • 93894889a0 Merge branch 'develop' into fmha_fwd_test_all_hdim dlejeune/fmha_fwd_test_all_hdim Damien Lejeune 2025-12-03 17:00:45 +00:00
  • 834ee396bb Merge commit '161835533becff72c71d20eff1e907a702820252' into develop assistant-librarian[bot] 2025-12-03 17:14:58 +00:00
  • 60145b3cce Add vector size management Mohsen Saffari 2025-12-03 16:33:34 +00:00
  • faa7f9ae07 Wmma support for gemm_multiply_multiply_wp (#3278) Enrico Degregori 2025-12-03 16:38:23 +01:00
  • b71ce6f8ac Wmma support for gemm_multiply_multiply_wp (#3278) Enrico Degregori 2025-12-03 16:38:23 +01:00
  • 161835533b Wmma support for gemm_multiply_multiply_wp (#3278) Enrico Degregori 2025-12-03 16:38:23 +01:00
  • 0cb95a3e70 Merge commit 'f29b67cf9b20be44299b2dcdd1716393c9c7569c' into develop assistant-librarian[bot] 2025-12-03 15:14:25 +00:00
  • 6a4a1962b8 [CK_BUILDER] Add Description::instance_string() method and update tests (#3340) John Shumway 2025-12-03 06:36:09 -08:00
  • 8d959a1ec0 [CK_BUILDER] Add Description::instance_string() method and update tests (#3340) John Shumway 2025-12-03 06:36:09 -08:00
  • f29b67cf9b [CK_BUILDER] Add Description::instance_string() method and update tests (#3340) John Shumway 2025-12-03 06:36:09 -08:00
  • 345758971e fix jukorhon/unified-attention-dev Juuso Korhonen 2025-12-03 13:36:48 +00:00
  • 7078de91d8 adding PAGE_BLOCK_SIZE >= BLOCK_SIZE optionality, now it regresses perf when it should improve? Juuso Korhonen 2025-12-03 13:08:29 +00:00
  • 065535cf81 some refactoring to align kernel with ck-tile kernels Mohsen Saffari 2025-12-03 12:03:31 +00:00
  • f0da8a4be5 Support A/B Quantization in Blockscale GEMM KenSCLin 2025-12-03 08:13:37 +00:00
  • 5f10aba31c Merge commit 'e6a583416b0dc534fcd023f90ed2ebf800fdd78b' into develop assistant-librarian[bot] 2025-12-03 10:18:22 +00:00
  • 67c2664625 [CK TILE] Add index optimizations for conv bwd weight (#3321) jakpiase 2025-12-03 10:53:46 +01:00
  • d58a45bb17 [CK TILE] Add index optimizations for conv bwd weight (#3321) jakpiase 2025-12-03 10:53:46 +01:00
  • e6a583416b [CK TILE] Add index optimizations for conv bwd weight (#3321) jakpiase 2025-12-03 10:53:46 +01:00
  • 4731c8e519 Further clarification in using kSubQKHeaddim and kQKHeaddim Qianfeng Zhang 2025-12-03 09:28:15 +00:00
  • ca1370cadc Support A/B Quantization in Blockscale GEMM KenSCLin 2025-12-03 08:13:37 +00:00
  • 2549bc1fee Clarify the using of kSubQKHeaddim and kQKHeaddim Qianfeng Zhang 2025-12-03 08:18:13 +00:00
  • 34a30c7b63 Support A/B Quantization in Blockscale GEMM KenSCLin 2025-12-03 08:13:37 +00:00
  • 8aa9c2bf6a Changing kernel_name_prefix to constructor ThruptiRajLakshmanaGowda 2025-12-03 03:04:18 +00:00
  • 99fd5ede31 update coherence zhimding/develop zanzhang 2025-10-29 20:35:25 +08:00
  • 4fc61d97ce Merge branch 'develop' into sparse_attention_VSA jiangyon.ren 2025-12-03 10:53:42 +08:00
  • 0607d31c77 add sparse attention VSA Jiangyong 2025-12-03 10:36:07 +08:00
  • dd3448b45e resolve merge conflicts with dev lwpck-4180 amd-khushbu 2025-12-03 00:45:21 +00:00
  • 52044c443c resolve merge conflicts with dev amd-khushbu 2025-12-03 00:41:54 +00:00