Commit Graph

  • c28e65c6bf [CK_TILE] Add mxfp4 flatmm (#3080) Yi DING 2025-10-31 11:29:05 +08:00
  • acec30dd09 [CK_TILE] Add mxfp4 flatmm (#3080) Yi DING 2025-10-31 11:29:05 +08:00
  • e135dd518d [CK_TILE] Add mxfp4 flatmm (#3080) Yi DING 2025-10-31 11:29:05 +08:00
  • bcccafee40 Update tile distribution for 2D bquant Cong Ma 2025-10-30 22:50:57 -04:00
  • c41df57bad Merge commit 'b387249fd905b595f2d38ac2a18d8c2aa9b88c00' into develop assistant-librarian[bot] 2025-10-31 00:35:07 +00:00
  • 527b718fa7 formatted khuagarw 2025-10-31 00:01:28 +00:00
  • 51b6f6fe7d [CK_BUILDER] Generalize convolution factory to build arbitrary device operations. (#3116) Ville Pietilä 2025-10-31 01:13:58 +02:00
  • 2ff4da7949 [CK_BUILDER] Generalize convolution factory to build arbitrary device operations. (#3116) Ville Pietilä 2025-10-31 01:13:58 +02:00
  • b387249fd9 [CK_BUILDER] Generalize convolution factory to build arbitrary device operations. (#3116) Ville Pietilä 2025-10-31 01:13:58 +02:00
  • ab856a3e02 Merge commit '90da26ccfd16ef0ba92a80d2044239636eab91ef' into develop assistant-librarian[bot] 2025-10-30 23:12:31 +00:00
  • b08404dfa9 [CK_BUILDER] Rename CK Builder test targets with consistent prefix test_ckb (#3114) Ville Pietilä 2025-10-31 01:08:32 +02:00
  • f65d76ed37 [CK_BUILDER] Rename CK Builder test targets with consistent prefix test_ckb (#3114) Ville Pietilä 2025-10-31 01:08:32 +02:00
  • 90da26ccfd [CK_BUILDER] Rename CK Builder test targets with consistent prefix test_ckb (#3114) Ville Pietilä 2025-10-31 01:08:32 +02:00
  • c27f898eed push stream to kernel maker Max Podkorytov 2025-10-30 12:41:06 -05:00
  • fb5d115c34 Merge commit '22d9f9994228b84fe79340292726ab840207552f' into develop assistant-librarian[bot] 2025-10-30 17:12:23 +00:00
  • 9d81a24df8 Fixed building CK Tile grouped conv fwd bias clamp example. (#3124) Ville Pietilä 2025-10-30 18:17:48 +02:00
  • 0a49238dd5 Fixed building CK Tile grouped conv fwd bias clamp example. (#3124) Ville Pietilä 2025-10-30 18:17:48 +02:00
  • 22d9f99942 Fixed building CK Tile grouped conv fwd bias clamp example. (#3124) Ville Pietilä 2025-10-30 18:17:48 +02:00
  • e7773919cc Merge commit '254bce934626502ecc043ff1ffb8e9609d8299d6' into develop assistant-librarian[bot] 2025-10-30 13:21:17 +00:00
  • e7696eeff1 tmp save (branch: vpietila/ck-vs-ck-tile-conv-benchmarking) Jakub Piasecki 2025-10-30 13:17:06 +00:00
  • ea4206f11e Sync release/rocm-rel-7.1 into docs/7.1.0 alexxu-amd 2025-10-30 09:06:51 -04:00
  • bcd00317f9 Lwpck 3550: Implement and test fixed precision fp8 x bf8 (#2963) SamiAario-AMD 2025-10-30 14:36:10 +02:00
  • d76e2879d0 Lwpck 3550: Implement and test fixed precision fp8 x bf8 (#2963) SamiAario-AMD 2025-10-30 14:36:10 +02:00
  • 254bce9346 Lwpck 3550: Implement and test fixed precision fp8 x bf8 (#2963) SamiAario-AMD 2025-10-30 14:36:10 +02:00
  • fd61987d73 [CK_TILE] Improve grouped conv kernel name generation (#3028) Ville Pietilä 2025-10-30 14:19:07 +02:00
  • 4694b1b4a7 [CK_TILE] Improve grouped conv kernel name generation (#3028) Ville Pietilä 2025-10-30 14:19:07 +02:00
  • 9ee9f4d2a3 [CK_TILE] Improve grouped conv kernel name generation (#3028) Ville Pietilä 2025-10-30 14:19:07 +02:00
  • eaf9650fed Use separate pipelines for using or not-using softmax situations Qianfeng Zhang 2025-10-30 08:01:22 +00:00
  • 11514c7775 Merge branch 'develop' into ck_tile_batched_contraction_kernel_generelizing msaffari-amd 2025-10-30 10:38:18 +01:00
  • 68e41da5f2 fix formatting Sami Remes 2025-10-30 08:48:39 +00:00
  • ef6e1866ff Merge commit '8c4cb4f9f4d3e96813c8dd5b26e175c169d14a9c' into develop assistant-librarian[bot] 2025-10-30 03:31:18 +00:00
  • e3d1fc26b6 Jimniu/ ck tile gemm stride validation (#2710) Jimniu 2025-10-29 22:45:09 -04:00
  • f5200e06c6 Jimniu/ ck tile gemm stride validation (#2710) Jimniu 2025-10-29 22:45:09 -04:00
  • 8c4cb4f9f4 Jimniu/ ck tile gemm stride validation (#2710) Jimniu 2025-10-29 22:45:09 -04:00
  • 6b08da83ab try encapsulating the kernel instantiation guts Max Podkorytov 2025-10-29 17:45:47 -05:00
  • de1ee4af17 Merge commit '1e77695fe87c4d4d979859a91f29fd29aebbbcbc' into develop assistant-librarian[bot] 2025-10-29 21:11:55 +00:00
  • 220bd7a9bb [CK_TILE] Support WMMA (gfx12) in FMHA (#2528) Anton Gorenko 2025-10-30 01:31:08 +05:00
  • 9a012c3135 [CK_TILE] Support WMMA (gfx12) in FMHA (#2528) Anton Gorenko 2025-10-30 01:31:08 +05:00
  • 1e77695fe8 [CK_TILE] Support WMMA (gfx12) in FMHA (#2528) Anton Gorenko 2025-10-30 01:31:08 +05:00
  • 6ad4f93cbf Merge pull request #3118 from spolifroni-amd/users/spolifroni-amd/CK-cherry-pick-doc-7.1 spolifroni-amd 2025-10-29 14:27:25 -04:00
  • 1747ce86f0 fixed the contrib guide spolifroni-amd 2025-09-09 15:24:44 -04:00
  • 067fd96c32 first commit of the glossary (#2702) spolifroni-amd 2025-09-08 13:55:32 -04:00
  • 306e25a27d only enable the working group sizes in tests Sami Remes 2025-10-29 17:32:58 +00:00
  • 1290b1b28a simplify conditions that are needed for tile distributions Sami Remes 2025-10-29 17:22:37 +00:00
  • 0eb1b551b7 Optimize batched contraction example: pass dimension sizes not vectors Mohsen Saffari 2025-10-29 16:33:46 +00:00
  • e7f5f0b82a Clean up batched contraction: remove legacy paths and finalize docs Mohsen Saffari 2025-10-29 16:00:12 +00:00
  • 207e6f10b8 Implementation of hstu attention pipeline using trload for v on mi350 Qianfeng Zhang 2025-10-27 14:54:36 +00:00
  • 26e9ec020f Merge commit 'cafaeb6b7bac4e18b0a5341cd14f54224292a0c9' into develop assistant-librarian[bot] 2025-10-29 15:12:59 +00:00
  • 670409c8f0 merge develop Mohsen Saffari 2025-10-29 15:11:38 +00:00
  • 361a4c6e23 Add instance traits for two more grouped forward convolutions (#3112) John Shumway 2025-10-29 08:04:13 -07:00
  • 2f0242c5ab Add instance traits for two more grouped forward convolutions (#3112) John Shumway 2025-10-29 08:04:13 -07:00
  • cafaeb6b7b Add instance traits for two more grouped forward convolutions (#3112) John Shumway 2025-10-29 08:04:13 -07:00
  • 88910537bf [CK_Tile] Merge multiple convolution groups into a single GEMM batch (#2986) Ville Pietilä 2025-10-29 16:49:28 +02:00
  • abccb649d1 [CK_Tile] Merge multiple convolution groups into a single GEMM batch (#2986) Ville Pietilä 2025-10-29 16:49:28 +02:00
  • 121bf0e1f3 [CK_Tile] Merge multiple convolution groups into a single GEMM batch (#2986) Ville Pietilä 2025-10-29 16:49:28 +02:00
  • fbdded6927 Use the device op signature for validation. (branch: vpietila/ckb-generaized-conv-factory-baseline) Ville Pietilä 2025-10-29 14:44:29 +00:00
  • 74ba32ea58 Add predicates for all device op instances. Ville Pietilä 2025-10-29 14:39:03 +00:00
  • df90bcbfd0 Added failure pattern check (#3111) andrew clark 2025-10-29 08:19:56 -06:00
  • 332a0e1696 Added failure pattern check (#3111) andrew clark 2025-10-29 08:19:56 -06:00
  • aa22da07be Added failure pattern check (#3111) andrew clark 2025-10-29 08:19:56 -06:00
  • e5eb4edd1a Add device operation to conv signature. Use unions to hold conv layouts and device operations. Ville Pietilä 2025-10-29 13:29:48 +00:00
  • e391b9d659 Add listing of all fwd and bwd device ops and instances. Ville Pietilä 2025-10-29 13:32:03 +00:00
  • ad8fca0253 Add device operation to conv signature. Use unions to hold conv layouts and device operations. Ville Pietilä 2025-10-29 13:29:48 +00:00
  • 237363809d update coherence (branch: yanda/wip_355) zanzhang 2025-10-29 20:35:25 +08:00
  • bd7124f00e update valarLip 2025-10-29 09:42:17 +00:00
  • 5e0a356e19 Remove commented code Sami Remes 2025-10-29 11:19:02 +02:00
  • 83b2a1d876 Merge commit '66bae4306cb2bebfe234fa689ee1d9048f2efa67' into develop assistant-librarian[bot] 2025-10-29 09:13:57 +00:00
  • cd30313161 Grouped conv fwd with direct load (#3082) Bartłomiej Kocot 2025-10-29 09:54:42 +01:00
  • 801546f608 Grouped conv fwd with direct load (#3082) Bartłomiej Kocot 2025-10-29 09:54:42 +01:00
  • 66bae4306c Grouped conv fwd with direct load (#3082) Bartłomiej Kocot 2025-10-29 09:54:42 +01:00
  • 205a31c693 Rename CK Builder test targets with consistent prefix test_ckb. Ville Pietilä 2025-10-29 08:42:26 +00:00
  • 6f6c855c0e Merge commit '3052d7c9e6972d5ea7d2225ab78e45554ba70efd' into develop assistant-librarian[bot] 2025-10-29 08:15:15 +00:00
  • 201463f036 [CK_TILE] Add indexing to pooling operator (Lwpck 3892) (#3013) Yashvardhan Agarwal 2025-10-29 09:58:04 +02:00
  • edea16ce14 [CK_TILE] Add indexing to pooling operator (Lwpck 3892) (#3013) Yashvardhan Agarwal 2025-10-29 09:58:04 +02:00
  • 3052d7c9e6 [CK_TILE] Add indexing to pooling operator (Lwpck 3892) (#3013) Yashvardhan Agarwal 2025-10-29 09:58:04 +02:00
  • e571490afc Merge commit '7c6430eca04e62454217630ae2a0bbd70ff50a00' into develop assistant-librarian[bot] 2025-10-29 07:13:01 +00:00
  • ac03aee245 [CK_TILE] fmha: Add query padding support to backward pass (#3097) Jeff Huang 2025-10-29 13:56:11 +08:00
  • 9ad15a658c [CK_TILE] fmha: Add query padding support to backward pass (#3097) Jeff Huang 2025-10-29 13:56:11 +08:00
  • 7c6430eca0 [CK_TILE] fmha: Add query padding support to backward pass (#3097) Jeff Huang 2025-10-29 13:56:11 +08:00
  • 7b759ce7e9 Merge commit '13e13ce359d8f21240a056addaca0bfb4fcd2f90' into develop assistant-librarian[bot] 2025-10-29 05:13:07 +00:00
  • da11c84803 [CK_BUILDER] Clean-up fwd conv builder implementation (#3110) Ville Pietilä 2025-10-29 05:37:33 +02:00
  • dc1cd3df0c [CK_BUILDER] Clean-up fwd conv builder implementation (#3110) Ville Pietilä 2025-10-29 05:37:33 +02:00
  • 13e13ce359 [CK_BUILDER] Clean-up fwd conv builder implementation (#3110) Ville Pietilä 2025-10-29 05:37:33 +02:00
  • e1475d4a52 one more fix to tile dstr, and revert debug initialization Sami Remes 2025-10-28 19:13:16 +00:00
  • 7c93551878 fix formatting Sami Remes 2025-10-28 18:55:51 +00:00
  • e12ab566e8 Fix some issues from the merge Sami Remes 2025-10-28 18:55:17 +00:00
  • d79c92527f Merge commit '515e28309153ae8ab6fa3cbed81b44e2c01c43cd' into develop assistant-librarian[bot] 2025-10-28 18:16:26 +00:00
  • bdc7c919f1 Merge branch 'develop' into philipm/documentation-cleanup-5 (branch: philipm/documentation-cleanup-5) Illia Silin 2025-10-28 11:03:24 -07:00
  • 7be2eed5c2 [CK_TILE] Top-K with Sigmoid kernel (#3062) Sami Remes 2025-10-28 17:54:06 +00:00
  • 39e77ae650 [CK_TILE] Top-K with Sigmoid kernel (#3062) Sami Remes 2025-10-28 17:54:06 +00:00
  • 515e283091 [CK_TILE] Top-K with Sigmoid kernel (#3062) Sami Remes 2025-10-28 17:54:06 +00:00
  • a449728fdd Merge remote-tracking branch 'origin/develop' into samremes/bmatrix_2d_blockscale Sami Remes 2025-10-28 17:49:53 +00:00
  • 0506549847 [CK_BUILDER] Factory tests (#3071) Robin Voetter 2025-10-28 18:27:42 +01:00
  • b09d802931 [CK_BUILDER] Factory tests (#3071) Robin Voetter 2025-10-28 18:27:42 +01:00
  • 6f58d6e457 [CK_BUILDER] Factory tests (#3071) Robin Voetter 2025-10-28 18:27:42 +01:00
  • 78d7289839 Add option to build ckProfiler packages for individual architectures. (#3105) Illia Silin 2025-10-28 09:48:11 -07:00
  • 96f8b985b7 Add option to build ckProfiler packages for individual architectures. (#3105) Illia Silin 2025-10-28 09:48:11 -07:00
  • 155d63f4fe Add option to build ckProfiler packages for individual architectures. (#3105) Illia Silin 2025-10-28 09:48:11 -07:00
  • 40c4bad35b [CK][Examples] Fix for example_grouped_gemm_multiple_d_dl_fp16 - corrected stride for B matrix. (#3104) Michał Kulikowski 2025-10-28 17:47:25 +01:00
  • cd5eeca2b0 [CK][Examples] Fix for example_grouped_gemm_multiple_d_dl_fp16 - corrected stride for B matrix. (#3104) Michał Kulikowski 2025-10-28 17:47:25 +01:00