Commit Graph

  • 308d69498c Merge commit 'e3556fed0453e66cdebc5dad6b903f5e902cd9b4' into develop assistant-librarian[bot] 2026-01-29 00:45:02 +00:00
  • 512846ba2c [CK] Refactor container array and function sequence_reverse_inclusive_scan Cong Ma 2026-01-19 16:48:55 -05:00
  • 29c56b8aae Optimize batch prefill kernel performance for VECTORIZED_LAYOUT KV cache (#3657) Jeff Huang 2026-01-29 07:18:41 +08:00
  • 4fb9c4b82a Optimize batch prefill kernel performance for VECTORIZED_LAYOUT KV cache (#3657) Jeff Huang 2026-01-29 07:18:41 +08:00
  • e3556fed04 Optimize batch prefill kernel performance for VECTORIZED_LAYOUT KV cache (#3657) Jeff Huang 2026-01-29 07:18:41 +08:00
  • 0c3ea9d826 Merge commit '83b58bb0c3ff12f426d45383900a6fd91b4116a1' into develop assistant-librarian[bot] 2026-01-28 21:42:05 +00:00
  • c2892466a9 Grouped Conv Bwd Weight Direct Load (#3648) Bartłomiej Kocot 2026-01-28 22:31:54 +01:00
  • 017d96faaa Grouped Conv Bwd Weight Direct Load (#3648) Bartłomiej Kocot 2026-01-28 22:31:54 +01:00
  • 83b58bb0c3 Grouped Conv Bwd Weight Direct Load (#3648) Bartłomiej Kocot 2026-01-28 22:31:54 +01:00
  • b73e82e2e6 Merge origin/develop into andriy/ck_tile/basic-tutorials Andriy Roshchenko 2026-01-28 21:27:28 +00:00
  • 002e077401 Fix block scale init value (#3666) ltqin 2026-01-29 04:37:15 +08:00
  • ee4e216716 Fix block scale init value (#3666) ltqin 2026-01-29 04:37:15 +08:00
  • 654bec3362 Fix block scale init value (#3666) ltqin 2026-01-29 04:37:15 +08:00
  • 5b6cbd329c Fix gfx942 and 90a ThomasNing 2026-01-28 14:31:54 -06:00
  • 569cb86a9b Merge commit '42048bdb7d8d931966af76c6dacfedce1c9da90a' into rocking/fmha-async-intrinsic rocking/fmha-async-intrinsic rocking 2026-01-28 14:12:33 -06:00
  • 5d0b96e179 Retrigger CI Max Podkorytov 2026-01-28 13:07:36 -05:00
  • dbadcf487a Merge commit '42048bdb7d8d931966af76c6dacfedce1c9da90a' into develop assistant-librarian[bot] 2026-01-28 17:20:56 +00:00
  • 6295d530ea Merge branch 'develop' into vpietila/add-fwd-conv-v3-instances-for-unit-group-size Bartłomiej Kocot 2026-01-28 18:16:49 +01:00
  • 97d6e59580 [CK_BUILDER] Integrate CKB validation with CK verification (#3649) Robin Voetter 2026-01-28 17:41:02 +01:00
  • 7d1574a9ab [CK_BUILDER] Integrate CKB validation with CK verification (#3649) Robin Voetter 2026-01-28 17:41:02 +01:00
  • 42048bdb7d [CK_BUILDER] Integrate CKB validation with CK verification (#3649) Robin Voetter 2026-01-28 17:41:02 +01:00
  • a9fcb27ded add transform grid Jakub Piasecki 2026-01-28 16:35:06 +00:00
  • 05f1759f0e [CK_BUILDER] Add reflection for wmma and bwd weight instances to ck builder reflection (#3592) kabrahamAMD 2026-01-28 17:33:45 +01:00
  • 140727e7c1 [CK_BUILDER] Add reflection for wmma and bwd weight instances to ck builder reflection (#3592) kabrahamAMD 2026-01-28 17:33:45 +01:00
  • d6cccf6093 [CK_BUILDER] Add reflection for wmma and bwd weight instances to ck builder reflection (#3592) kabrahamAMD 2026-01-28 17:33:45 +01:00
  • 3f5d7d1123 add padding to cshuffle epilogue to avoid bank conflict ThomasNing 2026-01-28 10:25:48 -06:00
  • eb3eacebce tmp save between remotes Jakub Piasecki 2026-01-28 16:07:11 +00:00
  • 44960922a2 Merge remote-tracking branch 'origin/jograner/bwd-weight-splitk-autodeduce' into features/grouped-conv-perf-uplift Ville Pietilä 2026-01-28 10:57:40 -05:00
  • c92b954537 Add new fwd conv fp16/bf16 instances optimized for unit group size. Ville Pietilä 2026-01-28 10:19:57 -05:00
  • fc1b683d18 Fix a build break Sami Aario 2026-01-28 15:37:13 +00:00
  • 0033748c62 revert custom ldstile, should be able to use the regular ones Sami Remes 2026-01-28 10:37:13 -05:00
  • c22b9ebe85 Profiling scripts. vpietila/ck-tile-split-k-opt Ville Pietilä 2026-01-28 10:32:32 -05:00
  • 0c5b72426d Add new fwd conv fp16/bf16 instances optimized for unit group size. Ville Pietilä 2026-01-28 10:19:57 -05:00
  • 78b36a13ab Merge commit 'bc6083bdd466d1e060253e7a49626c923293c483' into develop assistant-librarian[bot] 2026-01-28 15:18:44 +00:00
  • d0e9dc510e Merge branch 'develop' into LWPCK-3549-cleanups SamiAario-AMD 2026-01-28 17:14:23 +02:00
  • 28efdbd1c9 Update pytorch version in convolution dataset test generation (#3667) Johannes Graner 2026-01-28 15:38:10 +01:00
  • 04ed7d9ba9 Update pytorch version in convolution dataset test generation (#3667) Johannes Graner 2026-01-28 15:38:10 +01:00
  • bc6083bdd4 Update pytorch version in convolution dataset test generation (#3667) Johannes Graner 2026-01-28 15:38:10 +01:00
  • 029efffeb5 Update test with new applicability Graner, Johannes 2026-01-28 07:24:09 -05:00
  • 0eee2d3392 Fix threshold calculation Graner, Johannes 2026-01-28 09:18:03 -05:00
  • 66257cf9ca Merge branch 'develop' into ck_tile/gemm_blockscale_eightwarps kensclin 2026-01-28 19:05:00 +08:00
  • b83c07748c WIP: arbitrary batch dim Damien Lejeune 2026-01-28 06:00:10 -05:00
  • 19a000f6c3 Merge commit '8e3d84aba3be5e851de5d6c6c3e9c08cadbce1da' into develop assistant-librarian[bot] 2026-01-28 08:16:51 +00:00
  • bb0986e59e [CK_TILE] ABQuant New Preshuffle (#3638) Yi DING 2026-01-28 15:46:49 +08:00
  • 83c5a3b025 [CK_TILE] ABQuant New Preshuffle (#3638) Yi DING 2026-01-28 15:46:49 +08:00
  • 8e3d84aba3 [CK_TILE] ABQuant New Preshuffle (#3638) Yi DING 2026-01-28 15:46:49 +08:00
  • 349fb5206e Merge remote-tracking branch 'origin/develop' into vpietila/retina-net-training-perf Ville Pietilä 2026-01-28 02:38:04 -05:00
  • 55d8e9b4f0 Add missing logic to wmma multiple d kernel Graner, Johannes 2026-01-28 01:56:55 -05:00
  • fe73096c36 Merge branch 'develop' into ck_tile/gemm_blockscale_eightwarps Yi DING 2026-01-28 13:22:29 +08:00
  • 69fc05dd55 Add a readme file to ck/library/util jshumway/util-readme John Shumway 2026-01-27 23:13:29 -05:00
  • 122f8738f1 Adding gfx950 support root 2026-01-28 03:45:47 +00:00
  • bd846580d3 Merge branch 'develop' into tlakshma_tileengine_enable_arch tlakshma_tileengine_enable_arch Thrupti Raj Lakshmana Gowda 2026-01-27 21:35:47 -06:00
  • b6857cc1a2 Revert "poc convert fnuz fp8 to non-native dtype similar to ocp (#2871)" tenpercent/revert-ck-fp8-struct Max Podkorytov 2026-01-27 19:19:53 -05:00
  • d4b61e4db5 Merge commit '91e32f305fa4d809103431a81594c52240753d40' into develop assistant-librarian[bot] 2026-01-27 22:14:22 +00:00
  • bd3ca11235 chore: rename markdown file AviralGoelAMD 2026-01-26 23:28:55 +00:00
  • 373d8dd63d [CK Tile] multi reduce improvements (#3607) damien-lejeune 2026-01-27 21:56:09 +01:00
  • 24d3cbc30d [CK Tile] multi reduce improvements (#3607) damien-lejeune 2026-01-27 21:56:09 +01:00
  • 91e32f305f [CK Tile] multi reduce improvements (#3607) damien-lejeune 2026-01-27 21:56:09 +01:00
  • e9af74cb84 [ck] add gridwise base class for in all xdl kernel (#186) (#3544) linqunAMD 2026-01-28 04:49:47 +08:00
  • 5713c658c6 [ck] add gridwise base class for in all xdl kernel (#186) (#3544) linqunAMD 2026-01-28 04:49:47 +08:00
  • 23cefda140 [ck] add gridwise base class for in all xdl kernel (#186) (#3544) linqunAMD 2026-01-28 04:49:47 +08:00
  • 77db7f0f22 removing api ref etc (#3659) docs/7.2.0 spolifroni-amd 2026-01-27 14:37:30 -05:00
  • 362767eba8 Merge commit 'b737f1dee5a097f8b62156335e21259d8dd2784c' into develop assistant-librarian[bot] 2026-01-27 19:18:39 +00:00
  • 8130aa058e [CK]Refactoring threadwise_tensor_slice_transfer_v3r1.hpp (#3263) Michał Kulikowski 2026-01-27 19:48:16 +01:00
  • bdc1f4846a [CK]Refactoring threadwise_tensor_slice_transfer_v3r1.hpp (#3263) Michał Kulikowski 2026-01-27 19:48:16 +01:00
  • b737f1dee5 [CK]Refactoring threadwise_tensor_slice_transfer_v3r1.hpp (#3263) Michał Kulikowski 2026-01-27 19:48:16 +01:00
  • 417ad9c7f1 Merge commit 'b26cb596b0cbea9f40ae36b3f245b5aa7120c5c9' into develop assistant-librarian[bot] 2026-01-27 18:19:38 +00:00
  • 0acf5e439f [CK TILE] only use trivial_array in sequence.hpp Cong Ma 2026-01-27 13:03:48 -05:00
  • 30d4c25d5a use PackedSize in slicing Sami Remes 2026-01-27 13:01:06 -05:00
  • 71ac48d63a fix some syntax errors (#3658) Illia Silin 2026-01-27 09:59:39 -08:00
  • 7dd38d592e fix some syntax errors (#3658) Illia Silin 2026-01-27 09:59:39 -08:00
  • b26cb596b0 fix some syntax errors (#3658) Illia Silin 2026-01-27 09:59:39 -08:00
  • 08ec1f4192 update example code Sami Remes 2026-01-27 12:57:04 -05:00
  • f62cc5415f current state of pipeline Sami Remes 2026-01-27 12:56:24 -05:00
  • 82ef9432cc [CK TILE] refactor according to the review feedback Cong Ma 2026-01-23 19:59:16 -05:00
  • 51a3caa497 [CK TILE] Refactor sequence_reverse_inclusive_scan Cong Ma 2026-01-19 22:58:41 -05:00
  • 1c736b86c7 [CK TILE] Refactor sequence_reverse_inclusive_scan Cong Ma 2026-01-18 23:02:23 -05:00
  • 42fde4860a Custom global mem write. Ville Pietilä 2026-01-27 11:24:29 -05:00
  • 970869661c Global mem write debugging. Ville Pietilä 2026-01-27 11:23:37 -05:00
  • 21657a1f32 Merge commit '0cc83cb8e8c9d9d926469f862bc1272ef0cf0dc8' into develop assistant-librarian[bot] 2026-01-27 16:17:03 +00:00
  • 0fa0fdc85d Merge remote-tracking branch 'upstream/develop' into congma/ck_tile/preshuffle_b congma/ck_tile/preshuffle_b Cong Ma 2026-01-27 10:38:13 -05:00
  • cf363cb35d CK: removed the api reference (#3571) spolifroni-amd 2026-01-27 10:36:47 -05:00
  • 4cdc3132b3 CK: removed the api reference (#3571) spolifroni-amd 2026-01-27 10:36:47 -05:00
  • 0cc83cb8e8 CK: removed the api reference (#3571) spolifroni-amd 2026-01-27 10:36:47 -05:00
  • 74eb200c73 Fix error threshold calculations Graner, Johannes 2026-01-27 09:23:12 -05:00
  • fea562ee53 Merge commit 'b66597ed96180ce21e7e6a6678dfc232ed07c800' into develop assistant-librarian[bot] 2026-01-27 14:20:24 +00:00
  • ad3954f119 Enable bwd weight splitk autodeduction with cap Graner, Johannes 2026-01-27 08:46:53 -05:00
  • 72fa29bad5 Merge branch 'develop' into LWPCK-3549-cleanups SamiAario-AMD 2026-01-27 15:39:38 +02:00
  • 79a23ff7c3 Re-enable a previously failing fp16 x pkint4 test for MidLargeM re-enable-two-fp16-x-pkint4-tests Sami Aario 2026-01-26 21:13:49 +00:00
  • 0ab0639393 Re-enable a previously failing fp16 x pkint4 test for SmallM Sami Aario 2026-01-26 20:53:46 +00:00
  • 389639fe34 WIP: add naive version + block gemm version + tests & reference Damien Lejeune 2026-01-27 08:22:36 -05:00
  • 078912ec20 Add build time optimization documentation (#3608) Max Podkorytov 2026-01-27 05:07:27 -08:00
  • dbb766d951 Add build time optimization documentation (#3608) Max Podkorytov 2026-01-27 05:07:27 -08:00
  • b66597ed96 Add build time optimization documentation (#3608) Max Podkorytov 2026-01-27 05:07:27 -08:00
  • f669f39eaf WIP Matti Eskelinen 2026-01-27 07:23:12 -05:00
  • 098b4630f9 Fix Ding, Yi 2026-01-27 04:39:40 -05:00
  • d7af03f452 Merge commit '3d67e6c4927a9daea9076fab75b23fb44fdc22b1' into develop assistant-librarian[bot] 2026-01-27 09:19:15 +00:00
  • ab6bbbfee1 [CK TILE] Enable CK TILE Conv Fwd tests in CI and fix check_err (#3624) Bartłomiej Kocot 2026-01-27 10:04:11 +01:00
  • 42638c34b0 [CK TILE] Enable CK TILE Conv Fwd tests in CI and fix check_err (#3624) Bartłomiej Kocot 2026-01-27 10:04:11 +01:00
  • 3d67e6c492 [CK TILE] Enable CK TILE Conv Fwd tests in CI and fix check_err (#3624) Bartłomiej Kocot 2026-01-27 10:04:11 +01:00