Commit Graph

  • dc4366a876 add main include file Sami Remes 2026-02-06 18:12:54 +00:00
  • 06a8998254 clean up kernel and pipeline code Sami Remes 2026-02-06 18:11:17 +00:00
  • 241ee59880 clean up example a bit Sami Remes 2026-02-06 18:07:36 +00:00
  • 41353c8f3c [rocm-libraries] ROCm/rocm-libraries#4378 (commit d8e2826) Geo Min 2026-02-06 18:00:27 +00:00
  • 757438ef9c [ci] Adding mi350 required group ID (#4378) Geo Min 2026-02-06 09:59:29 -08:00
  • 8d236a8ff7 [ci] Adding mi350 required group ID (#4378) Geo Min 2026-02-06 09:59:29 -08:00
  • c588a1fd42 use unpacked scales Sami Remes 2026-02-06 17:26:03 +00:00
  • 061c9f9374 save packing approach Sami Remes 2026-02-06 15:54:57 +00:00
  • 0711f4f90a Add is_cross_attention as both host API and kernel parameter so that separate masking rules are used for self or cross attention Qianfeng Zhang 2026-02-06 15:40:07 +00:00
  • ec1e8ec58e Add benchmark example Damien Lejeune 2026-02-06 14:55:13 +00:00
  • 804a9d488c Improve parameterized tests Damien Lejeune 2026-02-06 10:59:21 +00:00
  • deecabacf8 Use defined shape Matti Eskelinen 2026-02-06 10:43:54 +00:00
  • e7ebd6c288 Readd naive normalization in mhc v3 Damien Lejeune 2026-02-06 09:44:20 +00:00
  • 2b5a5e364c Cleanup Matti Eskelinen 2026-02-06 10:27:17 +00:00
  • 683250e41b fix buffer size for output Matti Eskelinen 2026-02-06 09:33:33 +00:00
  • 3b31e42359 WIP (algorithm correct, but inaccurate) Matti Eskelinen 2026-02-06 08:38:12 +00:00
  • f397c022b0 gfx950 support for warptile with Double access num tlakshma_950_support root 2026-02-06 02:02:27 +00:00
  • 4dd4869fbf [rocm-libraries] ROCm/rocm-libraries#4361 (commit 37a74ef) Illia Silin 2026-02-06 01:07:34 +00:00
  • 2a054fc767 [CK] a bunch of CI fixes. (#4361) Illia Silin 2026-02-05 17:06:57 -08:00
  • 4dc5f52f57 [CK] a bunch of CI fixes. (#4361) Illia Silin 2026-02-05 17:06:57 -08:00
  • e96beb1f3e [rocm-libraries] ROCm/rocm-libraries#4352 (commit 3c9beb3) Eiden Yoshida 2026-02-05 22:57:20 +00:00
  • f382d48125 [CK] MICI: Fix git diff in selective_test_filter.py (#4352) Eiden Yoshida 2026-02-05 17:56:12 -05:00
  • 41fd407963 [CK] MICI: Fix git diff in selective_test_filter.py (#4352) Eiden Yoshida 2026-02-05 17:56:12 -05:00
  • 58549aa787 [rocm-libraries] ROCm/rocm-libraries#4360 (commit 5aa1f1d) Geo Min 2026-02-05 19:02:46 +00:00
  • 684654cf84 [ci] Updating variable group-id for OSSCI (#4360) Geo Min 2026-02-05 11:01:53 -08:00
  • 01302d22b5 [ci] Updating variable group-id for OSSCI (#4360) Geo Min 2026-02-05 11:01:53 -08:00
  • a8d48f9224 now offsetting with M/MPerXdl to get scales Sami Remes 2026-02-05 17:31:32 +00:00
  • d34d95b288 CK Tile Engine : Changes to Add fp8 double support in dispatcher root 2026-02-05 17:26:14 +00:00
  • 053aed9402 MHC V3 with gemm pipeline Damien Lejeune 2026-02-05 17:11:09 +00:00
  • d560f36399 [CK TILE] fix bugs of preshuffle_b congma/dev/fix_preshuffle_b Cong Ma 2026-01-29 22:18:26 -05:00
  • 344d98781b [rocm-libraries] ROCm/rocm-libraries#4351 (commit 3b98c98) Jobbins 2026-02-05 15:57:21 +00:00
  • d169ed2194 Change to tile setting to use mfma-32x32x16 for WithSoftmax pipeline on gfx950 Qianfeng Zhang 2026-02-05 15:57:18 +00:00
  • 3a02862241 [rocm-libraries] ROCm/rocm-libraries#4349 (commit 9bb7f5c) Eiden Yoshida 2026-02-05 15:56:52 +00:00
  • 9ac0ae7cba [composablekernel] fix failure status (#4351) Jobbins 2026-02-05 08:56:42 -07:00
  • ec787e6fa2 [composablekernel] fix failure status (#4351) Jobbins 2026-02-05 08:56:42 -07:00
  • ece63708f2 [CK] MICI: Correct path for build trace script (#4349) Eiden Yoshida 2026-02-05 10:55:44 -05:00
  • 9e00e291dc [CK] MICI: Correct path for build trace script (#4349) Eiden Yoshida 2026-02-05 10:55:44 -05:00
  • 8af5e26717 Add softmax selection to two of the testing scripts Qianfeng Zhang 2026-02-05 14:56:31 +00:00
  • 43a5678fdf WIP: MHC v3 Damien Lejeune 2026-02-05 13:04:18 +00:00
  • 0662a6c799 Debugging WIP Matti Eskelinen 2026-02-05 11:39:50 +00:00
  • c4daaf2334 fix packing in example Sami Remes 2026-02-05 10:29:19 +00:00
  • 350022827f init=1 init=2 working, some scales are still wrong as init=0 failing Sami Remes 2026-02-05 10:28:49 +00:00
  • eb7a0c1194 Removed debug print statement streamhpc/wavetile_transfer_bwd_data_and_bwd_wei-support apoorva 2026-02-05 10:20:51 +00:00
  • 6c61804665 try to enable scale loading in kernel and pipeline Sami Remes 2026-02-05 09:24:47 +00:00
  • 3f42f76b45 [rocm-libraries] ROCm/rocm-libraries#4336 (commit d26a782) Eiden Yoshida 2026-02-05 02:44:29 +00:00
  • e3f6354c67 [CK] MICI: Use reference repo for checkout operations (#4336) Eiden Yoshida 2026-02-04 21:43:22 -05:00
  • 606d2aaf31 [CK] MICI: Use reference repo for checkout operations (#4336) Eiden Yoshida 2026-02-04 21:43:22 -05:00
  • 4699dbf0dd Merge remote-tracking branch 'origin/develop' into andriy/ck_tile/basic-tutorials andriy/ck_tile/basic-tutorials Andriy Roshchenko 2026-02-04 23:43:18 +00:00
  • 7b18f5fed2 [rocm-libraries] ROCm/rocm-libraries#4263 (commit f34aec2) Jeff Huang 2026-02-04 23:26:20 +00:00
  • 9c0d4114ae [CK] Add FP8 KV_BLOCKSCALE support for batch prefill (#4263) assistant-librarian[bot] 2026-02-04 18:25:31 -05:00
  • 4231c8d673 [CK] Add FP8 KV_BLOCKSCALE support for batch prefill (#4263) assistant-librarian[bot] 2026-02-04 18:25:31 -05:00
  • a537b8e897 WIP: Padding Andriy Roshchenko 2026-02-04 21:22:53 +00:00
  • 78d0f92f71 add support and tests for conv v6 pipeline jakpiase/ck_tile/conv_pipeline_v5 Jakub Piasecki 2026-02-04 18:24:02 +00:00
  • 25c3f26747 Clang format fix apoorva 2026-02-04 17:37:28 +00:00
  • 32e82acf52 Updated instances for test fix apoorva 2026-02-04 17:36:55 +00:00
  • 62fbda4d1e [rocm-libraries] ROCm/rocm-libraries#4310 (commit 7f63aa1) Illia Silin 2026-02-04 17:35:17 +00:00
  • 170d49eb2c CK CI migration. (#4310) Illia Silin 2026-02-04 09:34:38 -08:00
  • 2df84787b6 CK CI migration. (#4310) Illia Silin 2026-02-04 09:34:38 -08:00
  • 15f427d5b3 test congma/dev/abquant_failure Cong Ma 2026-02-04 11:46:50 -05:00
  • 53201d2081 Disable some cases. Ville Pietilä 2026-02-04 10:47:12 -05:00
  • 7660fa0a2e Fix gitignore. Ville Pietilä 2026-02-04 10:07:26 -05:00
  • e2225e2baa Git ignore rocprofv3 files. Ville Pietilä 2026-02-04 10:05:14 -05:00
  • c97795b139 Remove .dat file. Ville Pietilä 2026-02-04 10:04:55 -05:00
  • 1c842f39a4 Git ignore profiler output. Ville Pietilä 2026-02-04 10:02:57 -05:00
  • 1c1ac4ef10 Small fixes to runner script. Ville Pietilä 2026-02-04 09:55:05 -05:00
  • 73b459c5a4 Runner script for benchmarking. Ville Pietilä 2026-02-04 09:38:16 -05:00
  • 7bd5a1d460 Extract layout order mapping into unified helper function jeonghyun/ckb-almiopen-522-descriptor-init JH-Leon-KIM-AMD 2026-02-04 13:51:21 +00:00
  • ad0db05b04 Only link with hiprtc when necessary. migraphx-ci-fix Mirza Halilcevic 2026-02-04 13:01:06 +00:00
  • 3e6415a8ea True baseline benchmarking results. Ville Pietilä 2026-02-04 07:55:28 -05:00
  • f03c931728 Improve debug output. Mirza Halilcevic 2026-02-04 06:37:36 -06:00
  • 20cf6df685 Best instances for benchmark shapes. Ville Pietilä 2026-02-04 07:27:34 -05:00
  • e18c83594b Add debug output. Mirza Halilcevic 2026-02-04 06:11:07 -06:00
  • 9ccde0cfc3 Specify PATHS for find_package(hiprtc). Mirza Halilcevic 2026-02-04 11:40:00 +00:00
  • 2cfc4209bb Profile optionally only a given instance. Ville Pietilä 2026-02-04 06:31:36 -05:00
  • f6f381dbd4 Benchmarking shapes and baseline results. Ville Pietilä 2026-02-04 06:06:02 -05:00
  • 403f36ed26 Disable building all but fwd convs for CK profiler. Ville Pietilä 2026-02-04 06:03:16 -05:00
  • f833c83ed2 Benchmarking shapes and baseline results. vpietila/retina-net-fwd-convs-baseline Ville Pietilä 2026-02-04 06:06:02 -05:00
  • da12672159 Disable building all but fwd convs for CK profiler. Ville Pietilä 2026-02-04 06:03:16 -05:00
  • 951ee54edc [CK] CK Tile grouped convolution direct load barkocot/ck-tile-direct-load-conv Bartlomiej Kocot 2026-02-04 10:41:14 +00:00
  • 556c904938 Add codegen for CK Tile bwd weight and bwd data convs. vpietila/ckb-add-ck-tile-bwd-weight-instances Ville Pietilä 2026-02-04 03:08:16 -05:00
  • fe5c0ce9b5 [CK_BUILDER] Replace to_spatial_array with FilterExtent::to_array JH-Leon-KIM-AMD 2026-02-04 08:00:49 +00:00
  • 40f0b122f7 disable daily cron jobs in standalone repo aick-647 illsilin_amdeng 2026-02-03 17:35:54 -08:00
  • 670fb88e82 [CK_TILE] Initialize CShuffleEpilogue test input on host side cshuffle-epilogue-tests Max Podkorytov 2026-02-03 20:13:23 -05:00
  • 2e3a716d72 Add new MFMA 16x16x16x2 example for GEMM with PADDING_K_FIRST optimization aviralgoel/gemm_tutorial AviralGoelAMD 2026-02-03 23:06:07 +00:00
  • bf4a662447 build fix bwd_data apoorva 2026-02-03 22:59:16 +00:00
  • 82a79cc337 Merge remote-tracking branch 'origin/develop' into preserved/composablekernel Ameya Keshava Mallya 2026-02-03 22:50:14 +00:00
  • ab1efa0334 Merge remote-tracking branch 'origin/develop' into preserved/composablekernel Ameya Keshava Mallya 2026-02-03 22:50:14 +00:00
  • 2e105bf36d Fix building issue on gfx950 congma/ck_tile/fix_preshuffle_b Cong Ma 2026-02-03 17:40:37 -05:00
  • ef80bd1641 [CK_TILE] Enhance CShuffleEpilogue tests with comprehensive coverage and improved verification Max Podkorytov 2026-02-03 15:56:58 -05:00
  • a82cdcb430 Migrating to rocm-libraries develop_deprecated Ameya Keshava Mallya 2026-02-03 22:30:05 +00:00
  • db2688820e Merge commit '421b714f139fda3361eb4d83a3a87fd8cc1cf169' into develop assistant-librarian[bot] 2026-02-03 18:31:31 +00:00
  • 8ecb21e120 Adding Additional Failure Patterns for Alerts (#3663) andrew clark 2026-02-03 11:23:07 -07:00
  • dc0dc337a6 Adding Additional Failure Patterns for Alerts (#3663) andrew clark 2026-02-03 11:23:07 -07:00
  • 421b714f13 Adding Additional Failure Patterns for Alerts (#3663) andrew clark 2026-02-03 11:23:07 -07:00
  • 34acc7c10b [CK_BUILDER] Remove legacy conv_fwd_ck_tile.hpp JH-Leon-KIM-AMD 2026-02-03 18:12:13 +00:00
  • 26ef84c026 [CK_BUILDER] Refactor: Remove ConvFwdProblem, use direct conversion JH-Leon-KIM-AMD 2026-02-03 18:11:49 +00:00
  • c89dfbf153 Fix initialization examples streamhpc/mix_prec_microscaling_bquant Enrico Degregori 2026-02-03 18:03:57 +00:00
  • fcd51e120c Merge branch 'develop' into cshuffle-fix cshuffle-fix Thomas Ning 2026-02-03 10:02:19 -08:00
  • aef327296e Revert "Implement device grouped gemm fixed nk multi abd for rdna4 (#3619)" (#3705) Illia Silin 2026-02-03 09:52:14 -08:00
  • ea1f04464b Revert "Implement device grouped gemm fixed nk multi abd for rdna4 (#3619)" (#3705) Illia Silin 2026-02-03 09:52:14 -08:00