composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-05-11 17:00:18 +00:00

Author	SHA1	Message	Date
Illia Silin	4a5ab67871	Enable CI on gfx12. (#1454 ) * enable CI build and test on gfx1201 * skip DL kernels in CI for gfx12 * only run CI on gfx12 if rocm version >= 6.2 * remove the rocm version check for CI on gfx12 * add a switch for CI builds on gfx12	2024-08-08 16:29:15 -07:00
Illia Silin	ae3b8ff86c	check if the coerce-illegal-types flag is supported (#1451 )	2024-08-08 07:29:29 -07:00
Illia Silin	8a75728406	add rocm-llvm-dev package to docker image (#1452 )	2024-08-08 07:29:13 -07:00
Juan Manuel Martinez Caamaño	901e5f1540	Remove reinterpret_cast uses that result in undefined behaviour. (#1445 ) * Remove reinterpret_cast uses that result in undefined behaviour. Use a bitcast instead. See https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_accessibility Closes #1439 * fix clang format --------- Co-authored-by: illsilin <Illia.Silin@amd.com>	2024-08-07 11:49:02 -07:00
Illia Silin	5df10432d8	upgrade to rocm6.2 as new default compiler (#1448 )	2024-08-07 09:38:43 -07:00
dependabot[bot]	a71d407e35	Bump rocm-docs-core from 1.6.1 to 1.6.2 in /docs/sphinx (#1449 ) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.6.1 to 1.6.2. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.6.1...v1.6.2) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-07 08:22:38 -07:00
Illia Silin	12c1f68dd9	Run CK_TILE FMHA benchmarks and collect the performance data. (#1447 ) * run ck_tile benchmarks after the smoke tests and store logs * change the path of fmha benchmark logs * change the way of stashig ck_tile fmha logs * prevent the errors in stages where no logs are generated * fix the ck_tile fmha log names and headers * generate the fmha performance logs in the root folder * change jenkins scrip arguments format * use exact file names for stashing * modify scripts to process FMHA performance results * unstash FMHA logs before parsing them	2024-08-07 08:18:26 -07:00
Max Podkorytov	886d14ccb2	modify python wrapper for addmm (#1441 )	2024-08-06 15:09:27 -07:00
Haocong WANG	6fc7bff58f	Limit fp8only operator build arch in ckProfiler (#1443 )	2024-08-06 14:29:14 -07:00
Jun Liu	afbf6350f3	Fix ROCm 6.2 compiler not fully supporting gfx12 when building CK with INSTANCES_ONLY (#1446 )	2024-08-06 13:06:53 -07:00
Juan Manuel Martinez Caamaño	fd9ef4e678	Add missing constexpr to if conditions (#1444 )	2024-08-06 11:40:34 -07:00
bibek	840c5397bb	adding mha as static lib (#1366 ) * adding mha as static lib * add fmha fwd compile options * typo * fix python version * python version to 3 * increase path length * add max path flag in mha cmake * fix long path issue * mha currently only runs in gfx94x * only buld mha in mi300 * populate gpu_list * add mha compile flags * avoid building mha in gpu other then gfx94x * some comments and include ck_tile in rocm * use rocm_install * place ck_tile in include * correct ck_tile path --------- Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>	2024-08-06 11:17:10 -05:00
jakpiase	b74d4d4d54	Fix for beta!=0 in reduce (#1440 ) * fix for beta!=0 in reduce * add reviewers suggestions	2024-08-06 09:10:39 -07:00
Bartłomiej Kocot	4ec5c52a0c	Add Grouped Conv Fwd Large Tensor kernel (#1432 ) * Support 64 bit indexing * Add new grouped conv fwd kernel for large tensors * Add instances large tensor * Fixes for transform conv to gemm * Fixes * fixes * Remove not needed instances * examples fixes * Remove not need ds arrays * Fix tests * Add 2GB check in gridwise dl * Fixes	2024-08-06 10:06:10 +02:00
Illia Silin	7f57b2e02c	add --offload-compress compiler flag (#1433 ) * add --offload-compress compiler flag * only apply the --offload-compress flag to the ckProfiler * move the --offload-compress flag back to main cmake file * add offload-compress to target compile option of ckProfiler --------- Co-authored-by: carlushuang <carlus.huang@amd.com>	2024-08-05 23:26:01 +08:00
Illia Silin	f31ba04afc	[CI][Jenkins] delete CI docker container upon exit (#1437 )	2024-08-05 08:13:56 -07:00
Illia Silin	d311c95396	Add compiler flags for ROCm versions 6.2+ (#1429 ) * add compiler flags to fix compiler issues * fix typo. * disable test_smfmac_op on all devices except gfx942 * specify full path to compiler in CI	2024-08-01 08:27:52 -07:00
Sam Wu	6648fd3b04	Update doc requirements (#1423 )	2024-07-31 07:42:42 -07:00
zjing14	f31e8dfa80	[HotFix] Fixed a typo in profile_gemm_multiply_multiply (#1425 ) * fixed a typo * clean --------- Co-authored-by: Jing Zhang <jizhan@fb.com>	2024-07-31 07:19:17 -07:00
arai713	d32997a792	Codegen: isSupportedArgument check (#1417 ) * added isSupportedArgument check into codegen device op * adding function call * remove commented code	2024-07-31 07:12:15 -07:00
carlushuang	b3f86e79dd	workaround rocm-6.2 compiler issue (#1421 )	2024-07-31 16:03:59 +08:00
Illia Silin	b527cad4a5	add docker for rocm6.2_rc4 compiler (#1424 )	2024-07-30 11:55:33 -07:00
Bartłomiej Kocot	33b399cc15	Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382 ) (#1406 ) (#1415 )	2024-07-30 18:36:04 +02:00
dependabot[bot]	b9ba5b2676	Bump rocm-docs-core from 1.6.0 to 1.6.1 in /docs/sphinx (#1420 ) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.6.0 to 1.6.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.6.0...v1.6.1) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-26 14:47:19 -07:00
trixirt	733f33af78	Introduce cmake USE_GLIBCXX_ASSERTIONS option (#1404 ) A standard option in Fedora packaging that is used to check the correctness of c++ use of the standard c++ library. Signed-off-by: Tom Rix <trix@redhat.com> Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>	2024-07-25 19:28:17 -07:00
zjing14	105bd708c7	Add rotating buff for gemm_multi_d (#1411 ) * add rotating_buff for gemm_multi_d * format * Update flush_cache.hpp * Update gtest.cmake --------- Co-authored-by: Jing Zhang <jizhan@fb.com> Co-authored-by: Haocong WANG <haocwang@amd.com>	2024-07-25 23:21:21 +08:00
dependabot[bot]	1208082e53	Bump rocm-docs-core from 1.5.1 to 1.6.0 in /docs/sphinx (#1416 ) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.1 to 1.6.0. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.1...v1.6.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-24 22:56:29 -07:00
Andriy Roshchenko	4a8a1befd5	Adding more instances of grouped convolution 3d forward for FP8 with ConvScale+Bias element-wise operation. (#1412 ) * Add CMakePresets configurations. * Add binary elementwise ConvScaleAdd and an example. * Numerical verification of results. Observed significant irregularities in F8 to F32 type conversions: ```log ConvScaleAdd: float=145.000000 f8_t=160.000000 e=144.000000 ConvScaleAdd: float=97.000000 f8_t=96.000000 e=104.000000 ConvScaleAdd: float=65.000000 f8_t=64.000000 e=72.000000 ``` * Implemented ConvScaleAdd + Example. * Add ConvScale+Bias Instances * Add Client Example for ConvScale+Bias * Fix number of bytes in an example.. * Cleanup.	2024-07-24 15:49:55 -05:00
Bartłomiej Kocot	ffabd70a15	Add support for half_t and bfloat to reduction operations (#1395 ) * Add support for half_t and bfloat to reduction operations * Fix bhalf convert * Next fix bf16	2024-07-24 12:12:37 -05:00
dependabot[bot]	33b2a2bdf5	Bump rocm-docs-core from 1.5.0 to 1.5.1 in /docs/sphinx (#1414 ) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.0 to 1.5.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.0...v1.5.1) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-24 07:10:50 -07:00
Haocong WANG	d22713a719	disable bad instance (#1410 )	2024-07-23 09:05:03 -07:00
Bartłomiej Kocot	5d8c3d8190	Revert Support access per groups and filter2x3 in grouped conv fwd (#1382 ) (#1406 )	2024-07-22 14:21:24 +02:00
Haocong WANG	8c90f25be3	[GEMM] F8 GEMM, performance optimized. (#1384 ) * add ab_scale init support * enabled interwave * add scale type; update isSupport * adjust example * clean * enable f8 pure gemm rcr ckprofiler * Add gemm_multiply_multiply instances * clang format * Optimize for ScaleBlockMNK=128 * enable abscale f8 gemm ck profiler * Add pure f8 gemm test suite * Reverting to the state of project at `f60fd77` * update copyright * clang format * update copyright --------- Co-authored-by: root <jizhan@amd.com>	2024-07-19 22:06:52 +08:00
ltqin	c544eb4da0	Universal gemm splitk using reduce (with multi-d) (#1341 ) * init for reduce_threadwise multi_d * add reduce_threadwise_multi_d * add reduce_multi_d * clean * start add an other splitk device op * add reduce template parameter to SplitKBatchOffset * add reduce c matrix * clean up code * change example data type to bf16 * add bf16Ai8B example * remove reduce template parameter * add splitk atomic status to v4 * example add multi d parameters * device op add multi-d parameters * add multi-d to reduce * fix kbach=1 bug * change B layout to col in bf16Ai8B example * remove float adding struct * change multi-d interface * change file and class name * remove multi-d of bf16Ai8B example * change IsReduce function to IsReduceAdd * change example layout to RRR from RCR * according layout to set ds stride * reset parameter layout * add gemm universal reduce instance * add reduce factory * add profile_gemm_universal_reduce * add reduce to profiler * fix reduce instance * fix profiler reduce compiling bug * format * format library instance code * add mem instance for reduce library * fix call instance names * add workspace for reduce in ckProfiler * format * add mnpading to reduce library instance * add fp16 instance to reduce of profiler * change copyright time * restore profiler cmake file * add reduce text to instances * add DsLayout and DsDataType to instances template parameter * fixed gemm_reduce_multi_d * add an example without multi_d * Update common.hpp * Update gtest.cmake * Update gemm_xdl_splitk_reduce_bf16.cpp * clean * Update gtest.cmake * format * fixe api * format * default parameter change to RRR * add vector_len for multi_d * format * Update gtest.cmake * fix bf16A iBB elementwiseop * add ReduceDataType * move ReduceDataType to end position * format * remove googletest git method address * fix copyright time * update init data --------- Co-authored-by: root <jizhan@amd.com> Co-authored-by: letaoqin <letaoqin@amd.com> Co-authored-by: Jing Zhang <jizhan@meta.com> Co-authored-by: zjing14 <zhangjing14@gmail.com>	2024-07-19 22:01:22 +08:00
Bartłomiej Kocot	70a814f163	Refactor transform conv to gemm fwd (#1391 ) * Refactor transform conv to gemm fwd * fixes codegen * wmma fixes * fix wmma * Fix copyright	2024-07-19 09:29:25 +02:00
Illia Silin	ab250afda0	add docker for rocm6.2_rc3 (#1401 )	2024-07-18 09:41:33 -07:00
Qianfeng	ee768148f0	Replace the using of __expf by __ocml_exp_f32 to work-around the test_softmax_rank4 failure (#1394 )	2024-07-17 09:15:05 -07:00
Mateusz Ozga	9cac282793	An option whether to colorize output during build (#1390 )	2024-07-16 09:52:44 -07:00
Illia Silin	4c3107fdcb	[ASAN builds] Modify the list of default targets for ASAN builds. (#1389 ) * add a build parameter to build only XNACK targets * use ENABLE_ASAN_PACKAGING flag to set targets for ASAN builds --------- Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>	2024-07-16 09:19:23 -07:00
Andriy Roshchenko	802a8a1df1	Adding more instances of grouped convolution 3d forward for FP8 with ConvScale element-wise operation and ReLU activation. (#1386 ) * Add CMakePresets configurations. * Add ConvScale+ReLU Functor and an Example * Account for ReLU FLOPs. * Add instances of 3D convolutions with ConvscaleRelu operation. * Implement Client Example * Cleanup	2024-07-16 08:51:49 -07:00
Haocong WANG	1ff4f25138	Disbale failed instance in rocm6.2 rel (#1388 )	2024-07-16 08:46:48 -07:00
Illia Silin	eca39050c6	add Rosty and Bartek to code owners (#1392 )	2024-07-16 23:44:46 +08:00
Bartłomiej Kocot	82e8a78a3f	Support access per groups and filter3x3 in grouped conv fwd (#1382 ) * Support access per groups and filter3x3 in grouped conv fwd * Fixes for large cases * Fixes for large tensors	2024-07-12 11:08:42 -07:00
zjing14	13c1e64daa	add gemm_bias_add example (#1361 ) * add gemm_bias_add example * changed strideD * clang-format --------- Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>	2024-07-11 18:08:07 -07:00
Rostyslav Geyyer	7a46a91c84	Add instances for grouped conv fwd 3d with ConvScale for bf8@fp8->fp8 (#1369 ) * Add an example * Add instances * Add a client example	2024-07-11 13:31:39 -07:00
Illia Silin	98a01bbc72	Add CK_TILE tests to daily CI builds. (#1381 ) * add ck_tile tests to CI * build and run ck_tile tests on gfx90a and gfx942 in parallel * fix groovy syntax * turn ck_tile tests OFF by default * skip creating the build folder * build ck_tile examples with 64 threads * build ck_tile examples with cmake-ck-dev.sh script * add video group to docker on mi300 * do not retry to rebuild the early CI stages * help prevent jenkins false failure * restore cron trigger	2024-07-11 13:22:40 -07:00
Illia Silin	f914c228c6	[Jenkins] restore cron jobs (#1380 ) * test the cron trigger * fix the cron jobs * restore the list of cron jobs	2024-07-11 10:28:11 -07:00
Illia Silin	a8eb872055	[gfx12] add gfx12 to the default target list (#1379 )	2024-07-10 14:54:04 -07:00
Sam Wu	860f957c22	Update changelog release headers (#1378 ) * Update doc codeowner syntax * Add doc link to changelog * Update changelog formatting for markdownlint Also change headings for releases	2024-07-10 09:36:10 -06:00
dependabot[bot]	da42a88964	Bump rocm-docs-core from 1.4.1 to 1.5.0 in /docs/sphinx (#1374 ) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.4.1 to 1.5.0. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.4.1...v1.5.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>	2024-07-09 12:48:23 -07:00

1367 Commits