Commit Graph

10 Commits

Author SHA1 Message Date
Jun Liu
c8a8385fdd [HotFix] add config and version files to pass on build info (#856)
* experiment with config file

* experiment with version.h config

* add more info to version.h

* minor updates

* minor updates

* fix case where DTYPE is not used

* large amount of files but minor changes

* remove white space

* minor changes to add more MACROs

* fix cmakedefine01

* fix issue with CK internal conflict

* fix define and define value

* fix clang-format

* fix formatting issue

* experiment with cmake

* clang format v12 to be consistent with miopen

* avoid clang-format for config file
2023-08-23 11:36:17 -07:00
Illia Silin
189ea3b9aa Add mechanism to build CK for select data types, add Navi3x CI. (#790)
* allow building CK for specific data types

* add CI build and test stage on Naiv3x without some int8 instances

* add missing gemm fp16 instances

* add the changes to the missed cmake file

* add empty lines at end of source files

* Do not build quantization client example on navi3 in CI

* disable batched_gemm_multi_d_int8 instances with DTYPES

* disable device_conv2d_bwd_data_instance with DTYPES

* fix ckprofiler for conv_bwd_data for int8

* properly isolate the conv_bwd_data int8 instances

* remove empty line
2023-07-17 18:02:42 -07:00
Illia Silin
b94fd0b227 update copyright headers (#726) 2023-05-31 18:46:57 -05:00
Po Yen Chen
8784a72e23 Modularize ckProfiler operations (#514)
* Re-structure ckProfiler source files

* Rename profiler.cpp to main.cpp

* Modularize ckProfiler operations

* Add description for profiler operations

* Use longer name to avoid name collision

* Use macro to delay expansion

* Use std::move() to avoid object copying

* Prohibit users from calling dtor

* Use macro to eliminate redundant code

* Make friend function hidden

* Add missing include directive <iostream>

* Fix wrong include directives

* Remove int8 from batchnorm-forward instances since it is not needed for forward training and could fail test

Co-authored-by: Qianfeng Zhang <Qianfeng.Zhang@amd.com>
2022-12-01 15:15:02 -06:00
Chao Liu
500fa99512 Clean up conv example, Instances, profiler and test (#324)
* convnd_fwd fp16 example

* update example

* update example

* update instance

* updating refernce conv

* update reference conv

* update conv fwd profiler

* update conv 1d and 3d instance

* update include path

* clean

* update profiler for conv bwd data and weight

* update conv bwd weight

* clean

* update conv example

* update profiler for conv bwd weight

* update ckprofiler for conv bwd data

* fix reference conv bwd data bug; update conv bwd data test

* update examples

* fix initialization issue

* update test for conv fwd

* clean

* clean

* remove test case too sensitive to error threshhold

* fix test

* clean

* fix build

* adding conv multiple d

* adding conv multiple D

* add matrix padder

* add gemm padding to convnd

* adding group conv

* update gemm multi-d

* refactor

* refactor

* refactor

* clean

* clean

* refactor

* refactor

* reorg

* add ds

* add bias

* clean

* add G

* adding group

* adding group

* adding group

* update Tensor

* clean

* update example

* update DeviceGemmMultipleD_Xdl_CShuffle

* update conv bwd-data and bwd-weight

* upate contraction example

* update gemm and batch gemm with e permute

* fix example build

* instance for grouped conv1d

* update example

* adding group conv instance

* update gemm bilinear instance

* update gemm+add+add+fastgelu instance

* update profiler

* update profiler

* update test

* update test and client example

* clean

* add grouped conv into profiler

* update profiler

* clean

* add test grouped conv, update all conv test to gtest

* update test
2022-07-29 18:19:25 -05:00
Shaojie WANG
b31b588dd2 remove unused conv bwd data profiler header and cpp (#245) 2022-05-20 12:34:23 -05:00
JD
cec69bc3bc Add host API (#220)
* Add host API

* manually rebase on develop

* clean

* manually rebase on develop

* exclude tests from all target

* address review comments

* update client app name

* fix missing lib name

* clang-format update

* refactor

* refactor

* refactor

* refactor

* refactor

* fix test issue

* refactor

* refactor

* refactor

* upate cmake and readme

Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-12 09:21:01 -05:00
ltqin
0536f2b312 Unified implementation of 1d/2d/3d conv bwd-data. fp32/fp16/bfp16/int8 (#134)
* start convnd bwd data

* add 3d laoyout name

* add conv1d reference

* add con3d reference

* finished example client code

* conv1d kernel finished

* fix input error

* add conv3d

* add 3d layout in conv_utils.hpp

* fix sepecial check

* addconvnd lib

* add test for bwd data

* finished test

* add check slice length

* convnd bwd data start

* profiler can be compiled

* fix some bug

* set input to zero

* modify readme for example

* fix test_convnd_bwd_data bug

* test_convnd_bwd_data parameter desc

* workaround for 1d

* workaroud for 2d

* change init value

* workaround for 3d int8

* fix init value bug

* remove workaround

* fix acc data type

* add int32

* change select function to template

* tilda to tilde

* remove int32 instance

* fix commit for device hpp

* fix comments for profiler

* using profile imp to test

* add pass verification

* fix conv2d reference

* fix conflict

* remove double batched_gemm

* fix exampel conv2d data and test convnd

* format

* change conv2d_bwd_data return value

* remove repeat = 1

* remove conv bwd data

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-03-29 10:52:25 -05:00
Chao Liu
f95267f166 Gemm+Reduce Fusion (#128)
* add gridwise gemm v4r1

* rename

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* use sfc in shuffling

* remove hardcode

* remove hardcode

* refactor

* fix build

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* format

* clean

* adding gemm+reduce

* adding profiler for gemm+reduce

* adding gemm+reduce profiler

* fix build

* clean up

* gemm+reduce

* fix build

* update DeviceGemm_Xdl_CShuffle; update enum to enum class

* clean up

* add test for gemm+reduce

* clean up

* refactor

* fix build

* fix build
2022-03-23 22:18:42 -05:00
ltqin
c254e5abd2 NHWC conv 2d: bwd fp32/fp16/bfp16/int8, Device level tuning and host API (#92)
* start conv2d bwd api

* kernel running

* add bwd reference

* change to no shuffle

* fix bwd reference

* pass verification

* add Filter1x1Stride1Pad0 and start testing

* change some tuning parameter

* fix test error

* add fp16 tuning parameter

* add bf16 tuning parameter

* add int8 tuning parameters

* change fp32 tuning parameter

* add bwd to profiler

* fix bug for bwd profiler

* fix ckProfiler bug

* change conv2d_bwd_xdl to fp16

* fix bug in comments

* fix precompile id

* fix enum conv name

* chage _bwd_ to _bwd_data_

* change conv2d_bwd example id

* bwd to bwd data

* fix prehead

* fix MakeDefaultBlock2CTileMap ,import form merge develop

* format bwd instance

* bwd to bwd data

* change name bwd to bwd data

* change name bwd to bwd data in example

* formate code

* change conv2d bwd data id in example

* rewrite readme for example

* fix CalculateMagicNumbers about div zero

* add workaround CK_WORKAROUND_SWDEV_325164

* change test_conf2d_bwd_data show info

* format

* fix bug for workaround:CK_WORKAROUND_SWDEV_325164

* formate tuning parameters

* formate tuning parameters again

* formate tuning parameters 3

* formate tuning parameters 4

* remove add function template

* format

* update comment

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-03-04 00:08:26 -06:00