Commit Graph

906 Commits

Author SHA1 Message Date
Paul
cddcb85659 Fix relative path 2023-05-25 18:16:04 -05:00
Paul
f89f3440b2 Fix header install 2023-05-25 17:08:05 -05:00
Paul
2e37592520 Remove -Werror 2023-05-25 17:01:27 -05:00
Paul
420c0312c7 Use enum for data types 2023-05-25 16:27:40 -05:00
Paul
3905f4a245 Fix compile errors 2023-05-25 16:10:30 -05:00
Paul
b155a0ac34 Move solution to common.hpp 2023-05-25 15:58:50 -05:00
Paul
e42607a5e6 Fix semicolon 2023-05-25 15:57:54 -05:00
Paul
856419e802 Move another function to cpp file 2023-05-25 15:56:40 -05:00
Paul
dd6fd8bb62 Move functions to cpp file 2023-05-25 15:54:15 -05:00
Alan Turner
61386bf903 Add edatatype and scalars_per_vector workaround 2023-05-25 12:11:09 -07:00
Alan Turner
6289e36f72 Add int8 instances 2023-05-25 11:26:05 -07:00
Alan Turner
6dd246a6f7 Merge remote-tracking branch 'origin/develop' into migx-jit-lib 2023-05-24 11:20:21 -07:00
Alan Turner
dc65f4c65e Use vectors for Ds types and layouts params 2023-05-24 11:06:24 -07:00
Illia Silin
ac9e01e2cc Clean-up the headers (#713)
* fix headers for gpu instances

* remove unused headers

---------

Co-authored-by: zjing14 <zhangjing14@gmail.com>
2023-05-24 08:11:25 -07:00
rocking
76ec0089fb Pool3d fwd (#697)
* Expand the base class of pool2d, prepare to share base class with pool3d

* Add pool3d device op

* Add pool3d f16 example

* Refactor the base class. implement generic pooling in the future

* clang format

* get original index in max pooling

* Add outputindex to base class

* Fix dimension

* Add pooling instance

* Use indexType instead

* Remove useless header

* Extract IndexDataType to template

* Extract pooling reference code

* clang format

* clang format

* Fix typo

* Add tensor stride

* Add missing header

* Add index stride and output stride

* Refine naming

* Add type to base class

* Rename file

* Use proper size

* Fix typo

* Refine naming

* Modify the argument into vector.

* Add max pool profiler

* Refine naming

* Support f32 pool

* Fix typo

* Add avg pool2d fwd in profiler

* clang format

* Rename AccDatatype to ComputeDatatype

* Fix init

* test pool

* Extract variable

* Add client example

* Check the pooling dim

* clang format

* Connect argv and arg_parser

* Add found check

* Remove useless header

* Refine naming

* Adjust the order of device_pool_fwd
2023-05-24 09:05:04 -05:00
Illia Silin
d821d1e54f Enable gemm_dl and other kernels on Navi3x. (#714)
* enable dl kernels on navi3

* do not build xdl tests and examples on Navi

* run tests before building everything on jenkins

* disable gemm_bilinear on gfx1030

* add gpu targets to installer on Navi

* put tests in the same order as before

* reduce the number of navi targets in CI

* build CI installed for gfx940 as well

* only build for MI300 during QA runs
2023-05-23 11:23:16 -05:00
Sam Wu
3cff340423 Documentation Updates (#710)
* update documentation dependencies

add version number to docs

rename doc config directories

enable more doc formats on rtd

add license section in docs
2023-05-18 11:08:38 -06:00
Alan Turner
e2878e2593 Merge remote-tracking branch 'origin/develop' into migx-jit-lib 2023-05-17 06:54:18 -07:00
Bartłomiej Kocot
642d5e9155 Add contraction profiler and tests (#701)
* Add contraction profiler and tests

* Build and style fixes

* Allow to use any elementwise operator for ref_contraction

* Introduce profile_contraction_scale and profile_contraction_bilinear

* Make ref_contraction generic and extend interface tests

* Stylistic minor fixes

* Extend test_contraction_interface
2023-05-15 09:46:52 -05:00
rocking
a1e344b1ae Normalization/split k (#615) 2023-05-11 07:15:02 -05:00
Rostyslav Geyyer
b076a02ad2 Optimize bf16 conversion (#664)
* Add TypeConvert class and start refactoring

* Refactor TypeConvert as a struct

* Get back to template functions type_convert

* Add a type_convert_bf16_rtn, set rtz as default

* Clean up

* Add UnaryConvertPrecision struct for high-precision workloads

* Format

* Update type_convert to UnaryConvert on threadwise level

* Update UnaryConvertPrecision

* Format

* Fix chmod

* Add a flag to pick converion method

* Format

* Remove the added flag

* Merge elementwise op with type conversion

* Move type_convert to elemwise op, update the op

* Update type_convert_precision -> bf16_convert_rtn

* Clean up

* Update comments

* Update the CK_WORKAROUND_DENORM_FIX flag handling

* Update the unneeded op to work but warn user

* Remove the message

* Use a PassThrough instead of ConvertBF16RTN to calcaulate reference

* Format

* Add missing include
2023-05-04 10:25:47 -05:00
Illia Silin
b8635a25b2 Fix the group of quantization_int8 kernels on MI300. (#695)
* replace amd_buffer_atomic_add with hip_atomic_add

* fix grouped_gemm_splitk kernels on mi300

* fix syntax

* revert experimental atomic_add changes

* fix the group of kernels from ticket 723 on MI300

---------

Co-authored-by: Jing Zhang <jizhan@amd.com>
2023-05-03 18:27:04 -05:00
Illia Silin
4a51d2da9d Fix grouped_gemm_splitk kernels on MI300. (#694)
* replace amd_buffer_atomic_add with hip_atomic_add

* fix grouped_gemm_splitk kernels on mi300

* fix syntax

* revert experimental atomic_add changes

---------

Co-authored-by: Jing Zhang <jizhan@amd.com>
2023-05-03 08:25:25 -07:00
Illia Silin
86e0190ec9 update daily build from rocm 5.4.3 to 5.5 (#693) 2023-05-03 08:18:10 -07:00
zjing14
f53ede26e5 fixed init range (#691) 2023-05-02 08:30:23 -07:00
Illia Silin
4feebedd41 Syncing up from internal repo to enable MI300. (#690)
* enable gfx940

* switch between intrinsic mfma routines on mi100/200 and mi300

* fix mfma_int8 on MI300

* disable 2 int8 examples on MI300

* Update cmake-ck-dev.sh

* restore gitignore file

* modify Jenkinsfile to the internal repo

---------

Co-authored-by: Jing Zhang <jizha@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
rocm-5.7.1 rocm-5.7.0
2023-04-28 18:22:59 -05:00
Haocong WANG
54c90aae13 add vector load check (#680)
Co-authored-by: zjing14 <zhangjing14@gmail.com>
2023-04-26 15:58:57 -05:00
Jun Liu
7613c1d9b9 [CK] suppress unsafe buffer warn (#687)
incomplete fix from https://github.com/ROCmSoftwarePlatform/composable_kernel/pull/670

So it does not only happen in gtest but also in CK code:

We need to fix them as a quality improvement, but for now suppressing this warning in immediate releases:
http://compiler-ci.amd.com/blue/rest/organizations/jenkins/pipelines/compiler-psdb-amd-stg-open/runs/2540/nodes/282/steps/3202/log/?start=0

e.g.
```
[2023-04-26T17:26:31.524Z] /jenkins/workspace/compiler-psdb-amd-stg-open/Libs/MIOpen/deps_hip/cget/build/tmp-a3db5da587a64213bde99fb856db1b43/composable_kernel-0f98035df1cc5ba3e90ab03187e672b426a25b00/include/ck/utility/generic_memory_space_atomic.hpp:52:19: error: unsafe pointer arithmetic [-Werror,-Wunsafe-buffer-usage]
[2023-04-26T17:26:31.524Z]         atomicAdd(c_style_pointer_cast<float*>(p_dst) + 1, vx.template AsType<float>()[I1]);
[2023-04-26T17:26:31.524Z]                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
```
[2023-04-26T17:26:31.523Z] /jenkins/workspace/compiler-psdb-amd-stg-open/Libs/MIOpen/deps_hip/cget/build/tmp-a3db5da587a64213bde99fb856db1b43/composable_kernel-0f98035df1cc5ba3e90ab03187e672b426a25b00/include/ck/utility/amd_inline_asm.hpp:62:20: error: 'p_a_half2' is an unsafe pointer used for buffer access [-Werror,-Wunsafe-buffer-usage]
[2023-04-26T17:26:31.523Z]     const half2_t* p_a_half2  = c_style_pointer_cast<const half2_t*>(&a);
[2023-04-26T17:26:31.523Z]     ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
2023-04-26 15:41:03 -05:00
Alan Turner
1ec9671734 Renaming and fix top level cmakelists 2023-04-25 11:01:51 -07:00
Adam Osewski
8bb2bb4a05 Grouped Gemm + SplitK + simplified Kernel Args (#669)
* simplify karg in device/grid split-k op

* fix mk_kn_mn instances

* add more instances

* B2C with 3D grid for KSplit

* Remove unused code.

* Use default B2C (3D grid) in grid gemm v2r4r2.

* Device gemm splitk use B2C map.

* Device GroupedGemmXdlSplitKCShuffle

* Example for GroupedGemm Xdl SplitK

* Introduce Device GroupedGemmSplitK

* Fix updating kbatch size.

* Add instance mk-nk-mn

* Enable set kbatch in profiler.

* Add GGemmSplitK mk-kn-mn instances

* Add more instances & split into multiple files.

* minor fix

* tuning

* clean

* disabled failed instances

* use pipe v2

* Ignore arg on not supported arch.

* fix warning

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
2023-04-24 15:43:36 -05:00
Alan Turner
600a987098 Clean up 2023-04-24 11:37:24 -07:00
Alan Turner
497c30e099 Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/composable_kernel into migx-jit-lib 2023-04-24 11:24:34 -07:00
Alan Turner
17acbbf473 Add jit library 2023-04-24 11:15:30 -07:00
zjing14
8b9cbba823 reduce inital number for half_t splitk (#685) 2023-04-24 08:07:39 -05:00
rocking
3eecbfb6ec Revise layout of group convolution (#675)
* [What] Remove pure conv int8 instance
[Why] We will never use pure int8 conv in AI, use int8 quantization instead

* Change layout

* Share the kernel parameter

* Support more type of NHWGC for group conv

* Revise client example of conv 2d, use NHWGC layout

* Add instance to cmake

* Revise layout of group conv quantization instance

* Revise layout of external api of group conv quantization

* Revise layout of group conv quantization client example

* Fix clang format

* Add comment to describe meaning of each parameter
2023-04-23 23:40:00 -05:00
Illia Silin
903cd19ce3 Put back the split-k gemm code. (#684)
* simplify karg in device/grid split-k op

* fix mk_kn_mn instances

* add more instances

* use name from tensor layout

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
2023-04-21 19:37:00 -05:00
Illia Silin
9afa44d40b Switch to the new rocm5.6 compiler. (#681)
* switch to the new rocm5.6 compiler and docker

* fix syntax
2023-04-21 07:59:26 -07:00
Sam Wu
938a5e0e41 Update dependabot config (#682)
Co-authored-by: samjwu <samjwu@users.noreply.github.com>
2023-04-20 21:55:56 -06:00
Illia Silin
bb0b772da9 Allow using ROCm release candidate compilers. (#679)
* enable use of rocm5.5 release candidate 4

* upgrade to ROCM5.5 RC5

* try fix the PUB_KEY error, remove the cmake-data package

* upgrade to latest cmake version

* use private dockerhub repo for rocm5.5 rc5

* add missing bracket
2023-04-18 09:22:49 -07:00
rocking5566
fd11a4a12a Add (#677) 2023-04-17 10:12:10 -05:00
Haocong WANG
fc26d42a2e Fix a typo (#676) 2023-04-15 21:57:34 -05:00
Rostyslav Geyyer
03eaee6ae6 Add more macros to turn on/off denorm fix (#678)
Co-authored-by: Rosty Geyyer <rosty.geyyer@amd.com>
2023-04-15 21:56:07 -05:00
Haocong WANG
e85178b4ca Add memory index guard in wmma device ops (#667) 2023-04-11 15:42:47 -05:00
Jun Liu
f532988713 [gtest] suppress unsafe buffer warn (#670)
ref: https://github.com/ROCmSoftwarePlatform/MIOpen/pull/1912
2023-04-11 15:41:49 -05:00
Sam Wu
fd497f0e79 Add dependabot config and pin rocm-docs-core (#663) 2023-04-11 09:18:38 -06:00
zjing14
c203bf6711 fixed quant example (#672)
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
2023-04-11 07:46:46 -05:00
zjing14
c54f8bcc25 add a marco to turn on/off denorm fix (off by default) (#673)
* add a marco to turn off denorm fix by default

* expose the marco

---------

Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
2023-04-11 07:44:43 -05:00
rocking5566
ed3a2e5226 Groupnorm + swish external api (#668)
* Rename to proper naming

* Add example of groupnorm + swish

* Extract duplicate code in example

* Add groupnorm + swish instances

* Ractor instance generation, split into multiple cpp file

* Add external api and client example

* Refine profiler message

* Use ck math version of exp

* Refine problem size in example

* Add host version of exp
2023-04-10 08:02:17 -05:00
Jun Liu
3248387bbb Issue #666: Revert "simplify karg in device/grid of split-k op (#644)" (#665)
This reverts commit bb5530af91.
2023-04-06 17:14:11 -07:00
zjing14
fde6d2742b add fp64 instances (#658)
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
2023-03-30 13:30:43 -05:00