Commit Graph

  • d9e9f19ca4 Run clang-formatting. Ville Pietilä 2025-10-07 13:41:54 +00:00
  • 3c50e984c9 Remove unused code. Ville Pietilä 2025-10-07 13:28:32 +00:00
  • 8eb5c51736 Skeleton for the merged conv groups test. vpietila/bwd-conv-weight-integration-tests Ville Pietilä 2025-10-07 12:39:47 +00:00
  • faf07cc3ab Code clean-up. Ville Pietilä 2025-10-07 10:41:18 +00:00
  • d89f1b510c Fix merge conflicts. Ville Pietilä 2025-10-07 10:32:32 +00:00
  • 9519405b4a Merge remote-tracking branch 'origin/develop' into vpietila/merge-multiple-conv-groups-into-single-wg-in-ck-tile Ville Pietilä 2025-10-07 08:05:32 +00:00
  • f8838d7b38 Refactored Descriptor to not be a template class. John Shumway 2025-10-06 21:14:57 +00:00
  • 98c3368cb6 Merge commit '19415d0b6f7766e0523baad10ef0a53232b1defd' into develop assistant-librarian[bot] 2025-10-06 20:13:14 +00:00
  • d4a4861a3c Clean up conv_description.cpp. John Shumway 2025-10-06 19:50:13 +00:00
  • af9520e598 fix: nil performance results for gemm examples (#2950) Aviral Goel 2025-10-06 15:43:23 -04:00
  • 2a7946bbe8 fix: nil performance results for gemm examples (#2950) Aviral Goel 2025-10-06 15:43:23 -04:00
  • 19415d0b6f fix: nil performance results for gemm examples (#2950) Aviral Goel 2025-10-06 15:43:23 -04:00
  • 81953c81d6 add type-conversion to AccDataType and then to CDataType to exactly mimic GPU's behavior fix_int4_tests Kevin Abraham 2025-09-12 15:35:24 +00:00
  • 710f9bf22d Merge commit 'd4761d7807da0a9205af0e2684e5a1a74e0052ad' into develop assistant-librarian[bot] 2025-10-06 16:13:15 +00:00
  • 626078c473 Fixing hash (#2973) Geo Min 2025-10-06 08:38:38 -07:00
  • 5676c0c028 Fixing hash (#2973) Geo Min 2025-10-06 08:38:38 -07:00
  • d4761d7807 Fixing hash (#2973) Geo Min 2025-10-06 08:38:38 -07:00
  • 7ed033a1b2 changed range for pki4 to -1...1 (-0.5...0.5 never really made sense for i4 anyway and always should have caused compiler errors, but since there was no int4 specialization of GeneratorTensor3 until now, this passed Kevin Abraham 2025-10-01 09:27:16 +00:00
  • 43bfed855d changed range of b_scale test initialization to -1...2 Kevin Abraham 2025-10-06 14:30:43 +00:00
  • 1d866cb75a Add tests for reflection::Description. John Shumway 2025-10-06 13:04:27 +00:00
  • e54cb5a713 initial commit Tianxing Wu 2025-10-06 13:02:38 +00:00
  • 1a4ce740a4 Remove dead code. vpietila/merge-multiple-conv-groups-fully-working-baseline Ville Pietilä 2025-10-06 12:59:10 +00:00
  • a3458d38c9 Remove debug output. Ville Pietilä 2025-10-06 12:58:16 +00:00
  • 9554344c94 Fix tensor descriptors for merged conv groups when K > 1. Ville Pietilä 2025-10-06 12:57:47 +00:00
  • 6bcb5e4ccf Revert removal of bf16 samremes/fmha_192x128_hdim_occupancy Sami Remes 2025-10-06 12:37:32 +00:00
  • a9ed3672d7 Use occupancy=1 for 192x128 head dims Sami Remes 2025-10-06 12:35:34 +00:00
  • 08814d5874 added specialization of GeneratorTensor_3 for int4 and fixed internal overflow Kevin Abraham 2025-08-19 13:03:48 +00:00
  • ea36b9eead changed gemm_b_scale and gemm_universal tests to use correct parameters Kevin Abraham 2025-08-20 13:48:14 +00:00
  • c35aee2b56 ported fixes back to non-batched version of b_scale Kevin Abraham 2025-08-22 13:41:17 +00:00
  • 0cc8891fba removed failing xld instances. Failure now uncovered now that tests were fixed Kevin Abraham 2025-10-05 14:32:28 +00:00
  • f2a0430ce1 Add initial reflection capabilities to the builder. John Shumway 2025-10-06 12:00:26 +00:00
  • 24fe5e4f80 Fix bugs in merged conv groups tensor descriptors. Ville Pietilä 2025-10-06 10:22:23 +00:00
  • 4fdeefa8ff Merge commit '96efe2f4855d643c2f88ff8d67eab6f21461fce1' into develop assistant-librarian[bot] 2025-10-06 10:13:13 +00:00
  • 44e3f18e0e ck tile engine integrate with gemm unit tests (#2601) msaffari-amd 2025-10-06 12:00:58 +02:00
  • 12909118ea ck tile engine integrate with gemm unit tests (#2601) msaffari-amd 2025-10-06 12:00:58 +02:00
  • 96efe2f485 ck tile engine integrate with gemm unit tests (#2601) msaffari-amd 2025-10-06 12:00:58 +02:00
  • e0763e25ce fix instance factory error Enrico Degregori 2025-10-06 07:35:22 +00:00
  • 22b5da468f Update README.md to align with the Algorithm concept. John Shumway 2025-10-02 00:51:13 +00:00
  • 816be4c417 Add placeholder README.md file John Shumway 2025-10-01 14:24:39 +00:00
  • 2d5311107f Fix and document the inlineDiff function. John Shumway 2025-09-30 17:11:39 +00:00
  • 089b5c6ffa Add StringEqWithDiff matcher. John Shumway 2025-09-30 16:30:15 +00:00
  • 86240357d4 Enable gmock in gtest.cmake. John Shumway 2025-09-30 14:07:38 +00:00
  • 2093e4e5b9 Add documentation to conv_signature.hpp. John Shumway 2025-09-25 15:37:47 +00:00
  • a30c9c362c Add color to inlineDiff test util. John Shumway 2025-09-23 15:38:27 +00:00
  • a40e1e7692 Add testing utils. John Shumway 2025-09-19 15:40:24 +00:00
  • 7d27c8663a Remove broken gmock and fix inline diff. John Shumway 2025-09-19 12:03:40 +00:00
  • b4bb2bf317 Format file and enable gmock. John Shumway 2025-09-18 22:38:40 +00:00
  • 9f65631f00 Add test_conv_bwd_instances.cpp. John Shumway 2025-09-18 17:50:34 +00:00
  • 8b29be6785 Rename test_conv_fwd_instances.cpp. John Shumway 2025-09-18 17:24:48 +00:00
  • 5878c32c14 Clean up factory for backwards convolutions. John Shumway 2025-09-18 05:12:20 +00:00
  • 1adb40d5c8 Add backward instance to the builder. John Shumway 2025-09-17 13:17:19 +00:00
  • 2890584ef1 Improve builder_utils. John Shumway 2025-09-14 14:33:48 +00:00
  • d771863a7d Fix naming style of all signature enums. John Shumway 2025-09-14 14:25:47 +00:00
  • 9c89f5eaf9 Provide default value for API VERSION. John Shumway 2025-09-14 14:16:40 +00:00
  • 24341a3fb8 Rename layouts to channels first or channels last. John Shumway 2025-09-14 14:03:35 +00:00
  • 33721db424 Add a 3D kernel instantiation. John Shumway 2025-09-14 00:14:55 +00:00
  • 1bc0a1281d Rename Factory template. John Shumway 2025-09-13 22:31:08 +00:00
  • 934cea8511 Fix capitalization of Builder::Factory type. John Shumway 2025-09-13 22:28:51 +00:00
  • 79f179c1a8 Add support and tests for different type. John Shumway 2025-09-13 22:25:22 +00:00
  • 5f0c272c9f Simplify Signature by removing constexpr. John Shumway 2025-09-13 14:00:49 +00:00
  • 432e29026c Add missing files. John Shumway 2025-09-13 13:47:41 +00:00
  • e4a93ba12a Fix concepts for convolution signature. John Shumway 2025-09-08 19:56:33 +00:00
  • da140c434b Update some more concept names. John Shumway 2025-09-07 19:35:54 +00:00
  • 32b4b27031 Describe the convolution instances tests. John Shumway 2025-09-07 19:28:17 +00:00
  • afbc4223ae Update concept names. John Shumway 2025-09-07 18:40:33 +00:00
  • 00741a7266 Use set_thread_cluster_dims helper. John Shumway 2025-09-06 17:12:26 +00:00
  • 1f986b9192 Add an alias to make enum more readable. John Shumway 2025-09-06 16:51:23 +00:00
  • 8b540c8df1 Use a set_submatrix helper. John Shumway 2025-09-06 16:46:57 +00:00
  • 5da397b9ec Add all device_grouped_conv_fwd_xdl_bf16_comp_instances John Shumway 2025-09-05 23:46:09 +00:00
  • cd1c1e0aff Add block GEMM pipeline version to builder. John Shumway 2025-09-05 23:14:02 +00:00
  • b2f501d5d7 Generalized version to StringLiteral. John Shumway 2025-09-05 22:30:00 +00:00
  • 3c020eb507 Update builder_utils.hpp. John Shumway 2025-09-05 20:41:32 +00:00
  • 0d8724a162 Convert SIGNATURE to non-template type parameter. John Shumway 2025-09-04 21:31:09 +00:00
  • 349b2febc8 Add two more instances to tests. John Shumway 2025-09-04 14:46:04 +00:00
  • 70415c2c16 Split builder tests and instance tests. John Shumway 2025-09-04 14:02:34 +00:00
  • 7acd24ef20 Migrate builder instantiation test to a TYPED_TEST_SUITE. John Shumway 2025-09-04 13:34:22 +00:00
  • 6a513e1a7f Add block transfer parameters to builder. John Shumway 2025-09-02 23:08:32 +00:00
  • 97660c64e5 Add test for ak1 and bk1. John Shumway 2025-09-02 21:06:16 +00:00
  • 834f0436a3 Making algorithm a non-type parameter John Shumway 2025-09-02 17:22:29 +00:00
  • f8b790dfd1 Add tuning parameters to builder. John Shumway 2025-09-02 16:32:32 +00:00
  • c0f5f5a20e Simplify convolution builder tests. John Shumway 2025-09-01 22:04:04 +00:00
  • 061fb06eef Add thread block info to factory. John Shumway 2025-09-01 21:54:41 +00:00
  • cee90b800e Fix test files for convolution builder. John Shumway 2025-08-28 02:01:46 +00:00
  • 897f966df6 Initial commit of convolution builder. John Shumway 2025-08-28 01:48:42 +00:00
  • c29a8ab871 Merge commit '58983a323287d41dff8b37c5318942d7159559dc' into develop assistant-librarian[bot] 2025-10-03 20:12:47 +00:00
  • ad73165d72 [TheRock CI] Bumping hash for TheRock (#2972) Geo Min 2025-10-03 12:50:16 -07:00
  • 8580b33f32 [TheRock CI] Bumping hash for TheRock (#2972) Geo Min 2025-10-03 12:50:16 -07:00
  • 58983a3232 [TheRock CI] Bumping hash for TheRock (#2972) Geo Min 2025-10-03 12:50:16 -07:00
  • a3698dab8d Merge commit 'b4a4aa2b64a7a94ab04126545a3dc4f6d3eba847' into develop assistant-librarian[bot] 2025-10-03 17:11:09 +00:00
  • 1acf95ad91 [CK Tile] CShuffle Tile Permute N all warp compatible (#2966) Thomas Ning 2025-10-03 09:46:13 -07:00
  • be09203966 [CK Tile] CShuffle Tile Permute N all warp compatible (#2966) Thomas Ning 2025-10-03 09:46:13 -07:00
  • b4a4aa2b64 [CK Tile] CShuffle Tile Permute N all warp compatible (#2966) Thomas Ning 2025-10-03 09:46:13 -07:00
  • 78ea3ef1a9 fix clang format Enrico Degregori 2025-10-03 15:14:43 +00:00
  • eed91c3984 change block_sync_lds function to be consistent with gfx12 path in Old CK radeon-ai/optimised-block_sync_lds Philip Maybank 2025-10-03 10:50:14 -04:00
  • 970f037606 Add more instances. Ville Pietilä 2025-10-03 14:39:31 +00:00
  • 48d22d2b9b Remove the obsolete template parameters. Ville Pietilä 2025-10-03 14:36:48 +00:00
  • 99fe3df99a Fix tensor descriptors. Ville Pietilä 2025-10-03 14:23:04 +00:00
  • 44f405c6c1 Merge commit '4c98535456c468cbd36d39de4a92406fa3a012b6' into develop assistant-librarian[bot] 2025-10-03 14:11:48 +00:00
  • 594b16ea3c fix compilation errors on RHEL8 and SLES15 (#2967) Illia Silin 2025-10-03 07:08:49 -07:00
  • f1efeaa564 fix compilation errors on RHEL8 and SLES15 (#2967) Illia Silin 2025-10-03 07:08:49 -07:00