Commit Graph

  • 54cd431f16 Improve the softmax+trload pipeline by using kN0=64 and prefetch only two k tiles Qianfeng Zhang 2025-11-05 16:23:05 +00:00
  • 0604b49864 Update Readme. Ville Pietilä 2025-11-05 15:27:21 +00:00
  • 2bdb5d1655 Remove obsolete files. Ville Pietilä 2025-11-05 15:09:28 +00:00
  • 03a65de6bc Add instance registry to code generation. Ville Pietilä 2025-11-05 15:06:56 +00:00
  • 8bfbdf6935 Add instance registry. Ville Pietilä 2025-11-05 15:03:19 +00:00
  • d190af2ef5 Tiny fix in trload with_softmax/no_softmax pipeline Qianfeng Zhang 2025-11-05 14:18:37 +00:00
  • 4626bace60 alloc device memory for scale ltqin 2025-11-05 14:31:22 +00:00
  • e96fb6555c Fix code generation. Ville Pietilä 2025-11-05 14:23:49 +00:00
  • 32d40f188c Fix build paths. Ville Pietilä 2025-11-05 13:51:48 +00:00
  • 82f04195f9 Add generation of device op instances via script. Ville Pietilä 2025-11-05 13:19:35 +00:00
  • ea5e21aa09 Remove redundant fields from JSON. Ville Pietilä 2025-11-05 06:07:09 -06:00
  • 082749ba81 xcd remap Tianxing Wu 2025-11-05 12:05:15 +00:00
  • 98f15eedcf Added block q Tianxing Wu 2025-11-05 11:44:10 +00:00
  • b0d8bf4ca3 [CK TILE] Convolution remove magic values Bartlomiej Kocot 2025-11-05 11:05:42 +00:00
  • 2349af01c7 block quant data ltqin 2025-11-05 11:01:54 +00:00
  • d9bb5c3535 Fix build. Ville Pietilä 2025-11-05 10:30:44 +00:00
  • fff8d16e93 Fix build. Ville Pietilä 2025-11-05 10:30:44 +00:00
  • f410a34b43 Add missing concept. Ville Pietilä 2025-11-05 10:29:16 +00:00
  • 525870a81e Fix build. vpietila/ckb-add-defaults-for-optional-template-params Ville Pietilä 2025-11-05 10:30:44 +00:00
  • 11803c73d6 Add missing concept. Ville Pietilä 2025-11-05 10:29:16 +00:00
  • 1213ef7618 Add missing concept. Ville Pietilä 2025-11-05 10:29:16 +00:00
  • bcb916f45e Merge remote-tracking branch 'origin/vpietila/ckb-remove-explicit-device-op-flag' into vpietila/ckb-fwd-bwd-instances Ville Pietilä 2025-11-05 04:10:09 -06:00
  • 3c6aae58f7 Refactor JSON scripts. Ville Pietilä 2025-11-05 04:08:53 -06:00
  • deec3a0dc1 Remove explicit device op flag from convolution signature. Ville Pietilä 2025-11-05 09:17:46 +00:00
  • 116e0c1c61 Move generator script. Ville Pietilä 2025-11-05 02:29:00 -06:00
  • f0291b7956 Added missing copyright. Ville Pietilä 2025-11-05 08:15:20 +00:00
  • 30df33538a Merge remote-tracking branch 'origin/develop' into vpietila/ckb-fwd-instance-test-improvements Ville Pietilä 2025-11-05 08:08:12 +00:00
  • ea517e1c34 Merge commit '3b076b0b74fec1c5a27a808cea45b21c6f526ced' into develop assistant-librarian[bot] 2025-11-05 03:31:59 +00:00
  • a70d21d523 Collecting redis stats (#3149) andrew clark 2025-11-04 19:55:11 -07:00
  • 2cdce54765 Collecting redis stats (#3149) andrew clark 2025-11-04 19:55:11 -07:00
  • 3b076b0b74 Collecting redis stats (#3149) andrew clark 2025-11-04 19:55:11 -07:00
  • bb4b6e5961 Initialize new variable to prevent c++17 compiler error (#3156) Illia Silin 2025-11-04 18:54:14 -08:00
  • 8d454aa01d Initialize new variable to prevent c++17 compiler error (#3156) Illia Silin 2025-11-04 18:54:14 -08:00
  • 930423ab3b Initialize new variable to prevent c++17 compiler error (#3156) Illia Silin 2025-11-04 18:54:14 -08:00
  • c25420e93f formatted khuagarw 2025-11-05 01:20:14 +00:00
  • 57d8d66258 Merge pull request #3078 from spolifroni-amd/spolifroni-amd/cherry-pick-changhelog-changes JeniferC99 2025-11-04 13:26:22 -08:00
  • 7148cc6371 Merge commit '31c019f5891f75a2c9a26cb3d3e61c63596e4c30' into develop assistant-librarian[bot] 2025-11-04 19:11:52 +00:00
  • 4d72320b51 Chunk Ctests so we dont run into large number of tests error (#3050) Vidyasagar Ananthan 2025-11-04 10:31:32 -08:00
  • 42d1855685 Chunk Ctests so we dont run into large number of tests error (#3050) Vidyasagar Ananthan 2025-11-04 10:31:32 -08:00
  • 31c019f589 Chunk Ctests so we dont run into large number of tests error (#3050) Vidyasagar Ananthan 2025-11-04 10:31:32 -08:00
  • 8c8fec6769 Merge commit '5abe4109e0c30993b9e1afe00f95154939043859' into develop assistant-librarian[bot] 2025-11-04 18:15:42 +00:00
  • 0343c4e1fe Introduces the new partitioner to implement the reduction StreamK kernel. (#3107) Cong Ma 2025-11-04 10:32:17 -07:00
  • 53e42f5cce Introduces the new partitioner to implement the reduction StreamK kernel. (#3107) Cong Ma 2025-11-04 10:32:17 -07:00
  • 5abe4109e0 Introduces the new partitioner to implement the reduction StreamK kernel. (#3107) Cong Ma 2025-11-04 10:32:17 -07:00
  • 4d94ea61e1 Merge commit '13ba06f1e75a28037c78c9d75f660f4ab7877d27' into develop assistant-librarian[bot] 2025-11-04 17:11:25 +00:00
  • 1a8f824938 fix the blockscale 2d case (#3148) Thomas Ning 2025-11-04 08:55:23 -08:00
  • dceaa603d0 fix the blockscale 2d case (#3148) Thomas Ning 2025-11-04 08:55:23 -08:00
  • 13ba06f1e7 fix the blockscale 2d case (#3148) Thomas Ning 2025-11-04 08:55:23 -08:00
  • 99993acca4 Improve both the with_softmax and no_softmax pipelines Qianfeng Zhang 2025-11-04 15:18:58 +00:00
  • 32a26d371b Merge commit '0be0288f58879123c228373525c4b438d354694f' into develop assistant-librarian[bot] 2025-11-04 15:13:12 +00:00
  • 4beee639b3 Add python script to create the serialized JSON file of the fwd instances. Ville Pietilä 2025-11-04 09:01:28 -06:00
  • a9d0980ad9 [CK_BUILDER] Update copyright messages. (#3150) John Shumway 2025-11-04 06:35:16 -08:00
  • 8def1a3a64 [CK_BUILDER] Update copyright messages. (#3150) John Shumway 2025-11-04 06:35:16 -08:00
  • 0be0288f58 [CK_BUILDER] Update copyright messages. (#3150) therock-7.10 srayasam/therock-test release/therock-7.10 John Shumway 2025-11-04 06:35:16 -08:00
  • 52204ff4e5 [CK_BUILDER] Add backward weight instance traits for xdl cshuffle. (#3143) John Shumway 2025-11-04 06:34:00 -08:00
  • e6364ac5df [CK_BUILDER] Add backward weight instance traits for xdl cshuffle. (#3143) John Shumway 2025-11-04 06:34:00 -08:00
  • 6dbee64886 [CK_BUILDER] Add backward weight instance traits for xdl cshuffle. (#3143) John Shumway 2025-11-04 06:34:00 -08:00
  • 5b7defb9da Merge commit '8681ced9629f6e952afa5b77c5f3549d60920efa' into develop assistant-librarian[bot] 2025-11-04 14:12:38 +00:00
  • 052c043d99 [CK TILE] Refactor Conv configs and Conv Elementwise (#3151) Bartłomiej Kocot 2025-11-04 15:04:53 +01:00
  • 7b4d9879da [CK TILE] Refactor Conv configs and Conv Elementwise (#3151) Bartłomiej Kocot 2025-11-04 15:04:53 +01:00
  • 8681ced962 [CK TILE] Refactor Conv configs and Conv Elementwise (#3151) Bartłomiej Kocot 2025-11-04 15:04:53 +01:00
  • 1cf6e00cad Merge branch 'vpietila/ckb-fwd-bwd-instances' of github.com:ROCm/composable_kernel into vpietila/ckb-fwd-bwd-instances Ville Pietilä 2025-11-04 07:52:27 -06:00
  • e5528129c8 Add structured representation of the instances. Ville Pietilä 2025-11-04 06:52:18 -06:00
  • 37030df32d Add workspace definition file. Ville Pietilä 2025-11-03 09:57:38 +00:00
  • 8e501e5c70 Add validation rules for builder parameters. Ville Pietilä 2025-11-03 09:57:27 +00:00
  • 156aeb7298 Move instances assets to a dedicated directory. Ville Pietilä 2025-11-03 09:56:59 +00:00
  • dccd8b4ac4 Add listing of all fwd and bwd device ops and instances. Ville Pietilä 2025-10-29 13:32:03 +00:00
  • ca436a0182 start calculate block scale ltqin 2025-11-04 13:20:47 +00:00
  • 0247a89f59 Add structured representation of the instances. Ville Pietilä 2025-11-04 06:52:18 -06:00
  • 5d6427a0fd Removed unnecessary includes. Ville Pietilä 2025-11-04 12:16:09 +00:00
  • c1db7497af Fix clang-formatting. Ville Pietilä 2025-11-04 12:04:09 +00:00
  • 930dcaab25 Merge branch 'develop' into vpietila/ckb-fwd-instance-test-improvements Ville Pietilä 2025-11-04 13:48:41 +02:00
  • 69a93a57f0 Change if-else statements into switch in conv factory. Ville Pietilä 2025-11-04 10:57:50 +00:00
  • adf0a80290 clang-format Ville Pietilä 2025-11-04 08:59:57 +00:00
  • 0ac48abe61 Improve ckb fwd conv instance tests. Ville Pietilä 2025-11-04 08:58:25 +00:00
  • c3857eeba2 Merge branch 'develop' into ck_tile_batched_contraction_kernel_generelizing msaffari-amd 2025-11-04 09:35:06 +01:00
  • 50922e4d32 fix Bartlomiej Kocot 2025-11-04 08:34:23 +00:00
  • 58d420c0a4 Merge commit '99f38e4d9bedcf1b09d58653c354f042f8c509ae' into develop assistant-librarian[bot] 2025-11-04 00:35:23 +00:00
  • a657071c4e [CK TILE] Refactor Conv configs and Conv Elementwise Bartlomiej Kocot 2025-11-03 23:38:56 +00:00
  • a3a55b00d7 [CK TILE] Refactor grouped conv fwd large tensor (#3144) Bartłomiej Kocot 2025-11-04 00:34:48 +01:00
  • a8500a082d [CK TILE] Refactor grouped conv fwd large tensor (#3144) Bartłomiej Kocot 2025-11-04 00:34:48 +01:00
  • 99f38e4d9b [CK TILE] Refactor grouped conv fwd large tensor (#3144) Bartłomiej Kocot 2025-11-04 00:34:48 +01:00
  • b33877c10f Update gridwise_gemm_xdl_cshuffle_conv_v3.hpp barkocot/grouped-conv-bwd-wei-split-k-hack Bartłomiej Kocot 2025-11-03 23:33:13 +01:00
  • 29bbf41353 Supplement the documentation ThomasNing 2025-11-03 21:59:29 +00:00
  • a0410f0a05 Merge commit 'c7ded76cc784f0b4d2c24d3985cb587ad22cbd7f' into develop assistant-librarian[bot] 2025-11-03 21:11:57 +00:00
  • 88f97ab36b Merge branch 'develop' into ck-tile-docs ThomasNing 2025-11-03 20:41:19 +00:00
  • c9e7b735c0 Adding note on CMake convenience script (#3139) Vidyasagar Ananthan 2025-11-03 12:21:57 -08:00
  • 67b2143c63 Adding note on CMake convenience script (#3139) Vidyasagar Ananthan 2025-11-03 12:21:57 -08:00
  • c7ded76cc7 Adding note on CMake convenience script (#3139) Vidyasagar Ananthan 2025-11-03 12:21:57 -08:00
  • 500a0527ac Merge branch 'develop' into samremes/bmatrix_2d_blockscale samremes/bmatrix_2d_blockscale ThomasNing 2025-11-03 20:20:13 +00:00
  • 963079e5e2 quick fix of the group sizes ThomasNing 2025-11-03 20:19:50 +00:00
  • a8059a2e58 Merge commit '507d81c3af51b81f15b946a2a4bef7f594620292' into develop assistant-librarian[bot] 2025-11-03 20:14:18 +00:00
  • 9575bcd099 Fix splitk preshuffle (#3137) Enrico Degregori 2025-11-03 20:59:01 +01:00
  • 9e3e865cec Fix splitk preshuffle (#3137) Enrico Degregori 2025-11-03 20:59:01 +01:00
  • 507d81c3af Fix splitk preshuffle (#3137) Enrico Degregori 2025-11-03 20:59:01 +01:00
  • 7ce8c0cf8f Merge commit '057b7d43b4f1edd4bc6e881403588af8c8e96fd4' into develop assistant-librarian[bot] 2025-11-03 18:14:59 +00:00
  • bf0dc8ce56 fix the compv4 and async pipeline when tile handler is 1 (#3141) Thomas Ning 2025-11-03 09:37:35 -08:00
  • d9bddd569f fix the compv4 and async pipeline when tile handler is 1 (#3141) Thomas Ning 2025-11-03 09:37:35 -08:00
  • 057b7d43b4 fix the compv4 and async pipeline when tile handler is 1 (#3141) Thomas Ning 2025-11-03 09:37:35 -08:00
  • 8a049e4de5 Merge commit '2ec57a8e704f55b545877f6e4f545ebda4a21833' into develop assistant-librarian[bot] 2025-11-03 17:12:19 +00:00