Commit Graph

73 Commits

Author SHA1 Message Date
Illia Silin
006e903d91 fix the cmake option syntax (#1117)
[ROCm/composable_kernel commit: fbf31a2ea3]
2024-01-03 07:56:44 -08:00
Illia Silin
b048f22fb2 adding -Wno-switch-default compiler flag (#1115)
[ROCm/composable_kernel commit: b268f273de]
2024-01-02 14:01:12 -08:00
Artur Wojcik
a4bd3ff6db enable compilation of INSTANCES_ONLY for Windows (#1082)
* enable compilation of INSTANCES_ONLY for Windows

* suppress ROCMChecks warnings on GoogleTests

* suppress -Wfloat-equal warning on GoogleTests

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

[ROCm/composable_kernel commit: fb5bd51b42]
2023-12-20 14:34:53 -08:00
Jun Liu
d29b1436fb ROCm 6.0 replaces all __HIP_PLATFORM_HCC__ with __HIP_PLATFORM_AMD__ (#1106)
* ROCm 6.0 replaces all __HIP_PLATFORM_HCC__ with __HIP_PLATFORM_AMD__

* make it backward compatible

* Update .clang-tidy

* Update ClangTidy.cmake

[ROCm/composable_kernel commit: 3ab1838fb0]
2023-12-19 07:16:49 -08:00
Illia Silin
d7a9aeeab6 add -Wno-pass-failed compiler flag (#1105)
[ROCm/composable_kernel commit: 3726a1730e]
2023-12-19 07:15:24 -08:00
trixirt
04d5dfa382 cmake: Add CK_PARALLEL_LINK_JOBS and CK_PARALLEL_COMPILE_JOBS options (#1063)
Copied from the llvm-project LLVM_PARALLEL_*_JOBS

Concurrent linking can break the build as well as having too many
compile jobs for the avaiable memory.  These options allow the user
to fine tune the build to fit within their machines memory
constraints.

An example use on linux is
COMPILE_JOBS=`cat /proc/cpuinfo | grep -m 1 'cpu cores' | awk '{ print $4 }'`
if [ ${COMPILE_JOBS}x = x ]; then
  COMPILE_JOBS=1
fi
BUILD_MEM=4
MEM_KB=0
MEM_KB=`cat /proc/meminfo | grep MemTotal | awk '{ print $2 }'`
MEM_MB=`eval "expr ${MEM_KB} / 1024"`
MEM_GB=`eval "expr ${MEM_MB} / 1024"`
COMPILE_JOBS_MEM=`eval "expr 1 + ${MEM_GB} / ${BUILD_MEM}"`
if [ "$COMPILE_JOBS_MEM" -lt "$COMPILE_JOBS" ]; then
  COMPILE_JOBS=$COMPILE_JOBS_MEM
fi
LINK_MEM=32
LINK_JOBS=`eval "expr 1 + ${MEM_GB} / ${LINK_MEM}"`

cmake -G Ninja -DCK_PARALLEL_LINK_JOBS=$LINK_JOBS
               -DCK_PARALLEL_COMPILE_JOBS=$COMPILE_JOBS

Signed-off-by: Tom Rix <trix@redhat.com>

[ROCm/composable_kernel commit: efaf31061a]
2023-12-14 17:26:41 -08:00
Illia Silin
e1a51c96fc Add daily run with mainline compiler. (#1075)
* add daily build with mainline compiler

* fix the compiler paths for ci

* remove the -flto flag

* build with clang by default

[ROCm/composable_kernel commit: afe4622014]
2023-12-04 19:04:52 -08:00
Illia Silin
1a389f2d2e Enable sccache in the default docker and CI. (#1009)
* replace ccache with sccache, pin package versions

* put ccache back temporarily to avoid breaking other CI jobs

* add sccashe_wrapper.sh script

* fix the package version syntax

* fix the pymysql package issue

* run sccache_wrapper before build if ccache server found

* set the paths before calling the sccache_wrapper

* use /tmp instead of /usr/local for cache

* try using sccache --start-server instead of wrapper

* try using redis server with sccache

* define SCCACHE_REDIS

* add redis and ping packages, and redis port

* use the new sccache redis server

* do not use sccache with staging compiler

* fix the condition syntax

* add stunnel to redis

* add tunnel verification

* separate caches for different architectures

* fix syntax for the cache tag

* quse double brackets for conditions

* add bash line to the script

* add a switch for sccache and only use it in build stage

* run check_host function when enabling sccache

* fix the invocation tags for sccache

* fix groovy syntax

* set the invocation tag in groovy

* disable sccache in clang-format stage

* try another syntax for invocation tags

* use local sccache server if can't connect to redis

* fix script syntax

* update README

* refresh readme

* readme updates

* remove the timing and verification caveat from readme

---------

Co-authored-by: Lisa Delaney <lisa.delaney@amd.com>

[ROCm/composable_kernel commit: 4e44a9e8da]
2023-10-30 13:16:29 -07:00
zjing14
dc94c20258 Clean DTYPES conditions in CMake (#974)
* Add a condition to build fp8 instances

* simplified buffer_load/store

* add bfp8/fp8

* fixed

* remove all f8/bf8 condition include folder

* fixed cmake conditions

* fixed DTYPES=fp16/bfp16

* fix

* fixed buffer_load

* fixed buffer_store

* fix

* clean example cmake files

* fixed ci

* fixed cit

---------

Co-authored-by: Rostyslav Geyyer <rosty.geyyer@amd.com>
Co-authored-by: Jing Zhang <jizha@amd.com>

[ROCm/composable_kernel commit: bf435140dc]
2023-10-18 11:14:14 -05:00
Illia Silin
6a8658812a get rid of gfx900/906, set rocm5.7 as default (#958)
[ROCm/composable_kernel commit: 59dbb01fd1]
2023-10-02 12:01:11 -07:00
Illia Silin
f9ce51a187 Use lower case for ckprofiler package. (#948)
* split ckProfiler gfx9 package into gfx90 and gfx94

* use lower case for package names

[ROCm/composable_kernel commit: 420b5a0382]
2023-09-26 17:43:09 -07:00
Illia Silin
37f4626e3e split ckProfiler gfx9 package into gfx90 and gfx94 (#946)
[ROCm/composable_kernel commit: 0b296a2722]
2023-09-26 11:22:31 -07:00
Illia Silin
99024ff371 Resolve some data type issues and cmake policy. (#940)
* split the types in gemm_bilinear instances, add condition to cmake policy

* fix syntax

* split the data types in batchnorm examples

* fix the batchnorm_bwd test

* fix types in the batchnorm_bwd test

[ROCm/composable_kernel commit: 2ea75bd6d7]
2023-09-26 08:39:11 -07:00
Illia Silin
3609ff10f7 Refactoring cmake files to build data types separately. (#932)
* refactor cmake files for the tests

* refactor cmake files for examples

* fix cmake for gemm example

* fix the cmake file for all examples

* add splitting by data types in gemm_splitk instance header

* rename test to reflect only dl instances are used

* clean up CI workspace, update cmake for instances

* change the jenkinsfile syntax

* build all instances except DL on gfx11

* move workspace cleanup after stages

* clean up workspace after every stage

* isolate data types in grouped_conv_fwd header

* isolate dl instances for grouped_conv2d_fwd

* fix syntax

* fix cmake and batchnorm instances

* fix typo

* fix reduction instances

* fix grouped_conv headers

* fix syntax

* replace parsing logic for instances, replace bfp16 with bf16

* fix the client examples build

* clean up DTYPES from instances cmake files

* update the parsing logic in cmake files

* make an exception for reduction kernels

* update few remaining cmake files to handle DTYPES

* fix syntax

* fix cmake conflicts

* replace f8 with fp8 test name

* resolve conflicts for dpp instances

[ROCm/composable_kernel commit: bba085d2b5]
2023-09-20 22:15:56 -07:00
Illia Silin
91cb870871 fix the ckprofiler package build in a loop (#926)
[ROCm/composable_kernel commit: 5a4416c8a7]
2023-09-19 09:17:39 -07:00
Jun Liu
f9e7629556 [Cmake] Set cmake default build type Release and path to /opt/rocm (#914)
[ROCm/composable_kernel commit: 5fe687fa27]
2023-09-13 14:38:12 -07:00
Rostyslav Geyyer
0752117077 Refactor f8_t, add bf8_t (#792)
* Refactor f8_t to add bf8_t

* Add check_err impl for f8_t

* Update fp8 test

* Format

* Revert the fix

* Update vector_type implementation

* Add bf8 test

* Add bf8, use BitInt types

* Add bf8 conversion methods

* Update type_convert for fp8/bf8

* Add check_err fp8/bf8 support

* Add subnorm fp8 tests

* Add subnorm bf8 tests

* Fix conversion

* Add bf8 cmake bindings

* Add macros to enable build with disabled fp8/bf8

* Remove is_native method

* Update flag combination for mixed precision instances

* Add more flag checks

* Add another flag to a client example

* Add type traits, decouple f8/bf8 casting

* Clean up

* Decouple fp8 and bf8 flags

* Remove more redundant flags

* Remove leftover comments

[ROCm/composable_kernel commit: 62d4af7449]
2023-09-12 17:04:27 -05:00
Lauren Wrubleski
766b5dc9d7 Fix config header installation (#880)
[ROCm/composable_kernel commit: bd8024b84a]
2023-09-04 09:49:40 -07:00
Jun Liu
2fb9a37881 [HotFix] add config and version files to pass on build info (#856)
* experiment with config file

* experiment with version.h config

* add more info to version.h

* minor updates

* minor updates

* fix case where DTYPE is not used

* large amount of files but minor changes

* remove white space

* minor changes to add more MACROs

* fix cmakedefine01

* fix issue with CK internal conflict

* fix define and define value

* fix clang-format

* fix formatting issue

* experiment with cmake

* clang format v12 to be consistent with miopen

* avoid clang-format for config file

[ROCm/composable_kernel commit: c8a8385fdd]
2023-08-23 11:36:17 -07:00
Illia Silin
33edc7449a Update the rocm version threshold to apply the -fno-offload-uniform-block flag. (#839)
* add fno-offload-uniform-block flag for rocm5.7 and up

* add a comment and compiler ticket number

* update the threshold rocm version

[ROCm/composable_kernel commit: cbbd172fd6]
2023-08-09 13:50:04 -07:00
Illia Silin
62bf177e98 add no-offload-uniform-block flag for rocm5.7 and up (#838)
* add -fno-offload-uniform-block flag for rocm5.7 and up

* add a comment and compiler ticket number

[ROCm/composable_kernel commit: 6802611334]
2023-08-08 17:58:31 -07:00
Illia Silin
79481b9486 add an option to build ckProfiler package for specific architectures (#828)
[ROCm/composable_kernel commit: 2474dddbee]
2023-08-03 10:10:27 -07:00
Illia Silin
f83a1c84c3 Disable DL kernels by default. (#816)
[ROCm/composable_kernel commit: 9195435c77]
2023-07-26 11:06:45 -05:00
Illia Silin
7e00942316 add INSTANCES_ONLY cmake macro to build only instances (#807)
[ROCm/composable_kernel commit: 7a29f711d4]
2023-07-21 15:31:19 -07:00
Illia Silin
74c83ffe26 Add mechanism to build CK for select data types, add Navi3x CI. (#790)
* allow building CK for specific data types

* add CI build and test stage on Naiv3x without some int8 instances

* add missing gemm fp16 instances

* add the changes to the missed cmake file

* add empty lines at end of source files

* Do not build quantization client example on navi3 in CI

* disable batched_gemm_multi_d_int8 instances with DTYPES

* disable device_conv2d_bwd_data_instance with DTYPES

* fix ckprofiler for conv_bwd_data for int8

* properly isolate the conv_bwd_data int8 instances

* remove empty line

[ROCm/composable_kernel commit: 189ea3b9aa]
2023-07-17 18:02:42 -07:00
Illia Silin
2ce9e7a4cf Add check for compiler GPU target support. (#800)
* check if gpu_targets are supported by compiler

* set default list of targets and filter for them

[ROCm/composable_kernel commit: 4867db4290]
2023-07-17 09:44:40 -07:00
Haocong WANG
ec634a3d32 Add CMake Option "USE_OPT_NAVI3X" (#647)
* Add CMake Option "USE_OPT_NAVI3X"

* remove navi3x opt compile option from cmake script

[ROCm/composable_kernel commit: 4e097ad283]
2023-03-29 14:07:33 -05:00
Lauren Wrubleski
a4d6101e99 Add packages for examples and profiler (#502)
* Add packages for example and profiler

* correct TEST_NAME -> EXAMPLE_NAME

[ROCm/composable_kernel commit: 37f2e91832]
2022-11-10 13:19:33 -06:00
Illia Silin
47484e4eea reduce the number of default targets (#489)
* reduce the number of default targets

* re-write the setting of target flags

* move all options to one place

* add new custom target instances for installing CK

[ROCm/composable_kernel commit: a5059f8f90]
2022-10-27 15:17:56 -06:00
Illia Silin
6fb932714d Add an option to build CK with clang directly (#387)
* replace hipcc compiler with clang++

* build client app with hipcc

* build client app with clang

* add an option to build with hipcc ro clang

* fix the environment for client app

* fix setting up compiler in cmake_build

* change the way the compiler is set

[ROCm/composable_kernel commit: 1e5b59df22]
2022-08-26 12:51:39 -05:00
Adam Osewski
2332d3657d int4 data type (#364)
* Introduce int4 data type.

* Add unit-tests for int4

* Compile int4 UT only when int4 enabled.

* clang-format

Co-authored-by: Adam Osewski <aosewski@amd.com>

[ROCm/composable_kernel commit: e00149ac67]
2022-08-18 14:53:47 -05:00
Anthony Chang
cc5140ac96 Fused attention (#345)
* initial stub for gemm_gemm_xdl_cshuffle

* set up example code

* compiles

* prevent integer overflow

* harmonize interface between ref_gemm and ref_batched_gemm

* batched_gemm_gemm

* fix example

* host tensor gen: diagonal pattern in lowest two-dimensions only

* make c descriptors containing only integral constants

* clean up

* add BlockwiseGemmXdlops_v2 while exploring an unified approach

* implement proper interface

* tidy up example

* fix compilation warnings

* coarsely controlled 2nd gemm padding

* remove rocm-cmake's hard requirement for certain revision

* clang-format

* resolve merge conflict

* fix compilation error on gfx10

* adds acc0 elementwise op to interface

* attention host validation

* add blockwsie softmax v1

* iteratively update softmax+gemm

* transpose both gemm0 and gemm1 xdl output so as to avoid broadcasting softmax max/sum

* add init method for easier debugging

* do away with manual thread cluster calculation

* generalize blockwise softmax interface

* row-wise softmax sum & max

* format

* rename to DeviceBatchedGemmSoftmaxGemm

* add gemm_softmax_gemm instances and tests

* comment

Co-authored-by: ltqin <letao.qin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: cac014f173]
2022-08-13 00:16:14 -05:00
Liam Wrubleski
de7b2f75d9 Remove incorrect old packaging statement (#308)
[ROCm/composable_kernel commit: eccf8773a6]
2022-06-30 09:40:03 -05:00
Liam Wrubleski
ef57890712 Switch to standard ROCm packaging (#301)
* Switch to standard ROCm packaging

* Revert .gitignore changes

* install new rocm-cmake version

* update readme

Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: b653c5eb2e]
2022-06-25 09:35:16 -05:00
Chao Liu
9df0a11a51 Absolute include path (#281)
* ad gelu and fast_gelu

* added GeLU and fast GeLU

* clean up

* add gemm+fastgelu example

* add gemm+gelu instances

* update profiler

* clean up

* clean up

* adding gemm+bias+activation

* clean

* adding bias

* clean

* adding gemm multiple d

* debugging

* add gemm bias add fastgelu

* rename, clean

* refactoring; add readme

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* fix

* fix

* update example

* update example

* rename

* update example

* add ckProfiler

* clean

* clean

* clean

* clean

* add client app example

* update readme

* delete obselete files

* remove old client app

* delete old file

* cleaning

* clean

* remove half

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path for all examples

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* revert client app example

* clean build

* fix build

* temporary disable client test on Jenkins

* clean

* clean

* clean

[ROCm/composable_kernel commit: d1db6a0c3e]
2022-06-24 20:51:04 -05:00
Chao Liu
983972fedf remove options.hpp.in (#240)
[ROCm/composable_kernel commit: 44943e0e21]
2022-05-20 14:40:12 -05:00
Anthony Chang
3a574a0f5c Validate examples in CI (#233)
* validate examples in ctest runs

* format

* fix usage of check_err

* amend

* add example codes to custom target 'check'

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: 9f71ff48e2]
2022-05-13 16:54:44 -05:00
JD
569dd9f47b Add host API (#220)
* Add host API

* manually rebase on develop

* clean

* manually rebase on develop

* exclude tests from all target

* address review comments

* update client app name

* fix missing lib name

* clang-format update

* refactor

* refactor

* refactor

* refactor

* refactor

* fix test issue

* refactor

* refactor

* refactor

* upate cmake and readme

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: cec69bc3bc]
2022-05-12 09:21:01 -05:00
Adam Osewski
159494284d Introduce GoogleTest framework. (#204)
* Use googletest for tests. Add conv2d_fwd UT.

* Add conv1D/3D to gtest UT.

* Fix: not duplicate test with CTest.

* Convert more tests to googltests.

* Fix: GIT_SHALLOW is not allowed for git commit hash.

* Clang-format

* use integer value for GEMM test

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Chao Liu <lc.roy86@gmail.com>

[ROCm/composable_kernel commit: 8eca05a633]
2022-04-30 08:50:16 -05:00
Chao Liu
82ad74304e Reorganize files, Part 1 (#119)
* delete obselete files

* move files

* build

* update cmake

* update cmake

* fix build

* reorg examples

* update cmake for example and test

[ROCm/composable_kernel commit: 5d37d7bff4]
2022-03-08 21:46:36 -06:00
JD
019ea09acf Update test CMakeLists to add new tests automatically and add Jenkins stage for tests (#88)
* add docker file and make default target buildable

* add Jenkinsfile

* remove empty env block

* fix package stage

* remove render group from docker run

* clean up Jenkins file

* add cppcheck as dev dependency

* update cmake file

* Add profiler build stage

* add hip_version config file for reduction operator

* correct jenkins var name

* Build release instead of debug

* Update test CMakeLists.txt
reorg test dir
add test stage

* reduce compile threads to prevent compiler crash

* add optional debug stage, update second test

* remove old test target

* fix tests to return proper results and self review

* Fix package name and make test run without args

* change Dockerfile to ues rocm4.3.1

* remove parallelism from build

* Lower paralellism

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: 992f71e371]
2022-03-03 16:59:42 -06:00
JD
ba2852c7a3 Initial Setup for CI (#86)
* add docker file and make default target buildable

* add Jenkinsfile

* remove empty env block

* fix package stage

* remove render group from docker run

* clean up Jenkins file

* add cppcheck as dev dependency

* update cmake file

* Add profiler build stage

* add hip_version config file for reduction operator

* correct jenkins var name

* Build release instead of debug

* clean up

Co-authored-by: Chao Liu <chao.liu2@amd.com>

[ROCm/composable_kernel commit: 2778e99758]
2022-02-18 21:44:11 -06:00
Chao Liu
8efcb80fa5 GEMM+Bias+ReLU+Add (#76)
* tweak conv for odd C

* update script

* clean up elementwise op

* fix build

* clean up

* added example for gemm+bias+relu+add

* added example for gemm+bias+relu

* add profiler for gemm_s_shuffle; re-org files

* add profiler

* fix build

* clean up

* clean up

* clean up

* fix build

[ROCm/composable_kernel commit: 823657ed12]
2022-02-06 22:32:47 -06:00
Chao Liu
7ead49ca42 added test for magic number division (#58)
[ROCm/composable_kernel commit: 237d4ca03f]
2021-11-30 09:09:28 -06:00
Chao Liu
b9f9ed96ac ckProfiler and device-level XDL GEMM operator (#48)
* add DeviceGemmXdl

* update script

* fix naming issue

* fix comment

* output HostTensorDescriptor

* rename

* padded GEMM for fwd v4r4r4 nhwc

* refactor

* refactor

* refactor

* adding ckProfiler

* adding ckProfiler

* refactor

* fix tuning parameter bug

* add more gemm instances

* add more fp16 GEMM instances

* fix profiler driver

* fix bug in tuning parameter

* add fp32 gemm instances

* small fix

* refactor

* rename

* refactor gemm profiler; adding DeviceConv and conv profiler

* refactor

* fix

* add conv profiler

* refactor

* adding more GEMM and Conv instance

* Create README.md

Add build instruction for ckProfiler

* Create README.md

Add Readme for gemm_xdl example

* Update README.md

Remove build instruction from top most folder

* Update README.md

* clean up

[ROCm/composable_kernel commit: e823d518cb]
2021-11-14 11:28:32 -06:00
Chao Liu
04d90a65c5 refactor
[ROCm/composable_kernel commit: 16effa767c]
2021-08-16 20:36:47 +00:00
Chao Liu
a0df0eb029 fix kernel filename
[ROCm/composable_kernel commit: 2c48039d0e]
2021-08-10 22:15:23 +00:00
Chao Liu
a15f7c025f fix clang warning suppression
[ROCm/composable_kernel commit: ddd49ec9e7]
2021-08-10 06:20:24 +00:00
Chao Liu
72c6d11a3b tidy
[ROCm/composable_kernel commit: 76f3131939]
2021-08-09 18:49:59 -05:00
Chao Liu
7b306ba01c tidy
[ROCm/composable_kernel commit: d18428901e]
2021-08-09 18:20:02 -05:00