Commit Graph

  • 118afa455c [CK_Tile] Support for group size 128 for Preshuffle quant for 2d block scale gemm (#3462) Khushbu Agarwal 2026-01-14 10:00:19 -08:00
  • 911a6b7282 Format build analysis script with ruff (tenpercent/ck-build-analysis-skill) Max Podkorytov 2026-01-14 11:55:24 -06:00
  • 9f4f9ce6a5 Add multi-file support to build analysis tool Max Podkorytov 2026-01-14 11:49:03 -06:00
  • 35c620ef84 Merge branch 'develop' into LWPCK-3549-cleanups SamiAario-AMD 2026-01-14 19:45:33 +02:00
  • c020a42797 Fix a build break introduced when merging Sami Aario 2026-01-14 17:44:48 +00:00
  • f6f9931541 WIP Sami Remes 2026-01-14 12:07:26 -05:00
  • 7095361e2e Update rocm-docs-core to 1.31.2 (docs/7.1.1) ROCm Docs Automation 2026-01-14 11:28:42 -05:00
  • 9a2c80d30f Merge commit '1fc5a3f3ac6bf204305c089f77a898b1f8765903' into develop assistant-librarian[bot] 2026-01-14 16:16:26 +00:00
  • 5d4e07e095 Merge remote-tracking branch 'origin/develop' into samremes/ck_tile_mx_gemm Sami Remes 2026-01-14 10:43:00 -05:00
  • 2eb573a0e2 Build CK on Windows (#3458) Ville Pietilä 2026-01-14 17:31:45 +02:00
  • 712235e237 Build CK on Windows (#3458) Ville Pietilä 2026-01-14 17:31:45 +02:00
  • 1fc5a3f3ac Build CK on Windows (#3458) Ville Pietilä 2026-01-14 17:31:45 +02:00
  • 7501af9cc6 Merge commit 'f173642087ed6034a0ac16188de2f36f4c008945' into develop assistant-librarian[bot] 2026-01-14 15:15:11 +00:00
  • 5d90ee1d85 Merge branch 'develop' into vpietila/ckb-refactor-warp-gemm-descriptors Ville Pietilä 2026-01-14 17:09:08 +02:00
  • b313b8eaea [CK] Refactor GPU verification kernel to gather error stats on GPU (#3551) Johannes Graner 2026-01-14 16:04:50 +01:00
  • 29be1248ff [CK] Refactor GPU verification kernel to gather error stats on GPU (#3551) Johannes Graner 2026-01-14 16:04:50 +01:00
  • f173642087 [CK] Refactor GPU verification kernel to gather error stats on GPU (#3551) Johannes Graner 2026-01-14 16:04:50 +01:00
  • 923923ac8d [CK Profiler] Initialize tensors on GPU in CK profiler (#3550) Johannes Graner 2026-01-14 16:04:14 +01:00
  • e29672610a [CK Profiler] Initialize tensors on GPU in CK profiler (#3550) Johannes Graner 2026-01-14 16:04:14 +01:00
  • 3ccb15ea02 [CK Profiler] Initialize tensors on GPU in CK profiler (#3550) Johannes Graner 2026-01-14 16:04:14 +01:00
  • 346b3fa5bd Merge remote-tracking branch 'origin/develop' into vpietila/ckb-refactor-warp-gemm-descriptors Ville Pietilä 2026-01-14 09:30:38 -05:00
  • 75d20e08f1 clang-format Ville Pietilä 2026-01-14 09:29:04 -05:00
  • 41bf6ecb5a Merge commit '717ed0b59f274990c58b01812fd4e50a39975aad' into develop assistant-librarian[bot] 2026-01-14 14:15:53 +00:00
  • 75ea587550 [CK_TILE][FMHA] Enable gpt-oss sink (#3490) Linjun-AMD 2026-01-14 21:32:06 +08:00
  • e038a25192 [CK_TILE][FMHA] Enable gpt-oss sink (#3490) Linjun-AMD 2026-01-14 21:32:06 +08:00
  • 717ed0b59f [CK_TILE][FMHA] Enable gpt-oss sink (#3490) Linjun-AMD 2026-01-14 21:32:06 +08:00
  • 096592eb99 Refactor conv algorithms into more categorized form. Ville Pietilä 2026-01-14 08:29:49 -05:00
  • 3f7b250d33 Small concepts clean-up. Ville Pietilä 2026-01-14 06:22:46 -05:00
  • 07608d1a86 Rename LDS transfer related assets. Ville Pietilä 2026-01-14 06:15:14 -05:00
  • 6210a83d5e Rename thread distribution to thread cluster. Ville Pietilä 2026-01-14 06:06:05 -05:00
  • af90b2e9dd Merge commit '693ff3bbb3ab01afc90fe12d5f5899d31a2ece62' into develop assistant-librarian[bot] 2026-01-14 10:14:55 +00:00
  • ad907f8d54 Add support for direct store in epilogue and padding support for wave transfer without transpose (#3465) Enrico Degregori 2026-01-14 11:02:19 +01:00
  • b74f6c663c Add support for direct store in epilogue and padding support for wave transfer without transpose (#3465) Enrico Degregori 2026-01-14 11:02:19 +01:00
  • 693ff3bbb3 Add support for direct store in epilogue and padding support for wave transfer without transpose (#3465) Enrico Degregori 2026-01-14 11:02:19 +01:00
  • 6e38ca6139 Merge remote-tracking branch 'origin/develop' into vpietila/ckb-refactor-warp-gemm-descriptors Ville Pietilä 2026-01-14 04:02:44 -05:00
  • ea4e543555 Merge branch 'develop' into LWPCK-3549-cleanups SamiAario-AMD 2026-01-14 10:59:05 +02:00
  • 23ea6ed4c6 Add copyright headers to all shell scripts Max Podkorytov 2026-01-14 00:59:57 -06:00
  • 6c187f54f2 Add copyright header and format with ruff Max Podkorytov 2026-01-14 00:27:42 -06:00
  • 8fcf1595a9 Replace hardcoded recommendations with data-driven insights Max Podkorytov 2026-01-14 00:18:14 -06:00
  • 4b8471b681 Use integer microseconds instead of float milliseconds Max Podkorytov 2026-01-14 00:04:08 -06:00
  • cef3e869b0 Fix command injection and path traversal vulnerabilities Max Podkorytov 2026-01-13 23:56:29 -06:00
  • 28489b05ca Use pipx to install uv instead of piping curl to bash Max Podkorytov 2026-01-13 23:23:10 -06:00
  • 52037f96f1 Auto-install uv for zero-configuration dependency management Max Podkorytov 2026-01-13 23:19:24 -06:00
  • 13655f2757 Extract common utilities and improve default granularity Max Podkorytov 2026-01-13 23:16:42 -06:00
  • caf3f74e12 Use uv run as default execution path for automatic dependency management Max Podkorytov 2026-01-13 22:56:29 -06:00
  • 7e091c06c5 Extract Python script and make PEP 723 compliant Max Podkorytov 2026-01-13 22:50:17 -06:00
  • fc53e81355 Refactor report generation to use Jinja2 templates Max Podkorytov 2026-01-13 22:41:51 -06:00
  • 0fc7bfefbd Add ck-build-analysis skill for compilation profiling Max Podkorytov 2026-01-13 21:13:44 -06:00
  • ab768af196 Refactor WarpGemm dispatcher and compose attributes (LWPCK-3731) Jeff Huang 2025-11-07 12:06:15 +08:00
  • ba65875e4d combine build and rebuild (tenpercent/cc-skill-build) Max Podkorytov 2026-01-13 19:49:00 -06:00
  • 7d18bd4c4b Merge branch 'develop' into tenpercent/cc-skill-build Max Podkorytov 2026-01-13 17:16:00 -08:00
  • 5ba39269c7 try to handle corner cases Max Podkorytov 2026-01-13 18:54:51 -06:00
  • 2dbf9c368b Merge commit '51027474afe07ba069123f37798867270d59ac12' into develop assistant-librarian[bot] 2026-01-14 00:40:04 +00:00
  • 014881027d add the skill for running a docker container with correct options; build and run tests in the container Max Podkorytov 2026-01-13 18:38:57 -06:00
  • e231cfb3dc [CK TILE ENGINE] CI fix for Basic Tile Engine (#3554) Thrupti Raj Lakshmana Gowda 2026-01-13 18:20:30 -06:00
  • 183c01c8f1 [CK TILE ENGINE] CI fix for Basic Tile Engine (#3554) Thrupti Raj Lakshmana Gowda 2026-01-13 18:20:30 -06:00
  • 51027474af [CK TILE ENGINE] CI fix for Basic Tile Engine (#3554) Thrupti Raj Lakshmana Gowda 2026-01-13 18:20:30 -06:00
  • 7897628111 initiaiate async_v2 pipeline rocking 2026-01-13 15:10:45 -06:00
  • a379e96169 Merge pull request #3558 from ROCm/spolifroni-amd-fix-small-issue spolifroni-amd 2026-01-13 15:42:49 -05:00
  • 40d24b0587 Update buffer_views.rst spolifroni-amd 2026-01-13 14:55:06 -05:00
  • cbe8d381cc debugging khuagarw 2026-01-13 18:26:37 +00:00
  • a648b6c373 Merge commit '00c46785a8a590bfe76b3fae20f23109a2685f4d' into develop assistant-librarian[bot] 2026-01-13 18:17:38 +00:00
  • 0c8c232a0a Shuffle fix for gfx950 (#3491) Thomas Ning 2026-01-14 01:21:29 +08:00
  • f444eab66c Shuffle fix for gfx950 (#3491) Thomas Ning 2026-01-14 01:21:29 +08:00
  • 00c46785a8 Shuffle fix for gfx950 (#3491) Thomas Ning 2026-01-14 01:21:29 +08:00
  • d45b830804 Merge branch 'develop' into aviralgoel/memory_pipeline_refactor_2 Aviral Goel 2026-01-13 22:04:56 +05:30
  • 97e2b52bf5 Merge commit '9908a87c311352057da5eed93271ce7ea575ad21' into develop assistant-librarian[bot] 2026-01-13 16:17:05 +00:00
  • e40687bfc3 [CK_BUILDER] Add bwd weight factories (#3509) Ville Pietilä 2026-01-13 18:12:38 +02:00
  • 4caaa64c39 [CK_BUILDER] Add bwd weight factories (#3509) Ville Pietilä 2026-01-13 18:12:38 +02:00
  • 9908a87c31 [CK_BUILDER] Add bwd weight factories (#3509) Ville Pietilä 2026-01-13 18:12:38 +02:00
  • f9f3844dd6 Merge remote-tracking branch 'origin/vpietila/ckb-bwd-weight-factories' into vpietila/ckb-refactor-warp-gemm-descriptors Ville Pietilä 2026-01-13 10:54:36 -05:00
  • f9557c3692 Merge commit '710fa1fd3d317839ac9627751279f89ad610f20d' into develop assistant-librarian[bot] 2026-01-13 15:17:57 +00:00
  • 18b676b24c fix incorrect List import in reduce_parameter.py (#3555) Po Yen Chen 2026-01-13 22:33:05 +08:00
  • 83dac7e00f fix incorrect List import in reduce_parameter.py (#3555) Po Yen Chen 2026-01-13 22:33:05 +08:00
  • 710fa1fd3d fix incorrect List import in reduce_parameter.py (#3555) Po Yen Chen 2026-01-13 22:33:05 +08:00
  • 93ff8b07a2 use new pipeline in example Sami Remes 2026-01-13 09:25:13 -05:00
  • faf91267ec Improve dispatcher error messages. Fix builder smoke tests. Ville Pietilä 2026-01-13 08:57:00 -05:00
  • 9b1c8ae951 moved common attributes to helpers Kevin Abraham 2026-01-13 13:39:58 +00:00
  • edd11c9852 Extend comp async pipeline with scales Sami Remes 2026-01-13 06:46:28 -05:00
  • f944bc03fa Extend comp async pipeline with scales Sami Remes 2026-01-13 05:47:55 -05:00
  • 9b58c20e1c Merge remote-tracking branch 'origin/develop' into vpietila/ckb-bwd-weight-factories Ville Pietilä 2026-01-13 04:20:32 -05:00
  • bf57fbf488 clang-format Ville Pietilä 2026-01-13 04:20:08 -05:00
  • a8f1d44078 Make BlockTransferDescriptor concept parametrized. Introduce a common TileTransferParameters concept for conv algorithms. Ville Pietilä 2026-01-13 04:17:06 -05:00
  • 1d519792ca Unify block transfer for fwd and bwd directions. Rename ThreadSliceDim to ThreadClusterRank. Ville Pietilä 2026-01-13 03:19:10 -05:00
  • 7b15c22e7e fixed prefetch stage gemm Kevin Abraham 2026-01-13 08:16:04 +00:00
  • 97793cf352 Unify handling of conv tensor types between fwd and bwd directions. Ville Pietilä 2026-01-13 03:08:01 -05:00
  • b5d060b6b3 Remove old layout and elementwise ops. Ville Pietilä 2026-01-13 02:44:54 -05:00
  • 3e8f3907f2 Unify conv elementwise ops and layout definitions for fwd and bwd directions. Ville Pietilä 2026-01-13 02:43:37 -05:00
  • 7c78c99af9 Merge branch 'develop' into aviralgoel/memory_pipeline_refactor_2 Aviral Goel 2026-01-13 12:42:22 +05:30
  • 25bf808899 Merge commit 'eb041079a36a767ccc8aa9a0a9d0e2822f352f03' into develop assistant-librarian[bot] 2026-01-13 06:17:41 +00:00
  • d69aeffd0d Implement grouped gemm tile loop for RDNA4 (#3304) Erwin Terpstra 2026-01-13 07:14:23 +01:00
  • 18c8824e3c Implement grouped gemm tile loop for RDNA4 (#3304) Erwin Terpstra 2026-01-13 07:14:23 +01:00
  • eb041079a3 Implement grouped gemm tile loop for RDNA4 (#3304) Erwin Terpstra 2026-01-13 07:14:23 +01:00
  • 0d13ef7329 [CK Tile] Fix FMHA LSE calculation and potential division by zero (#3326) Jeff Huang 2026-01-13 13:52:26 +08:00
  • eb143eade0 [CK Tile] Fix FMHA LSE calculation and potential division by zero (#3326) Jeff Huang 2026-01-13 13:52:26 +08:00
  • 141f77aa12 [CK Tile] Fix FMHA LSE calculation and potential division by zero (#3326) Jeff Huang 2026-01-13 13:52:26 +08:00
  • dd7236189c Merge commit 'c9f112b0267625016a58ce3465ee34232c85812b' into develop assistant-librarian[bot] 2026-01-13 04:27:40 +00:00
  • 99b88be5fb [FMHA] Support page_size=1 (linear layout) in batch prefill pipeline (#3545) Jeff Huang 2026-01-13 12:04:43 +08:00
  • 908afb3a55 [FMHA] Support page_size=1 (linear layout) in batch prefill pipeline (#3545) Jeff Huang 2026-01-13 12:04:43 +08:00
  • c9f112b026 [FMHA] Support page_size=1 (linear layout) in batch prefill pipeline (#3545) Jeff Huang 2026-01-13 12:04:43 +08:00