Commit Graph

14 Commits

Author SHA1 Message Date
Chao Liu
d3051d7517 add license in file (#303) 2022-06-24 23:32:43 -05:00
Chao Liu
d1db6a0c3e Absolute include path (#281)
* ad gelu and fast_gelu

* added GeLU and fast GeLU

* clean up

* add gemm+fastgelu example

* add gemm+gelu instances

* update profiler

* clean up

* clean up

* adding gemm+bias+activation

* clean

* adding bias

* clean

* adding gemm multiple d

* debugging

* add gemm bias add fastgelu

* rename, clean

* refactoring; add readme

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* fix

* fix

* update example

* update example

* rename

* update example

* add ckProfiler

* clean

* clean

* clean

* clean

* add client app example

* update readme

* delete obselete files

* remove old client app

* delete old file

* cleaning

* clean

* remove half

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path for all examples

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* fix header path

* revert client app example

* clean build

* fix build

* temporary disable client test on Jenkins

* clean

* clean

* clean
2022-06-24 20:51:04 -05:00
Adam Osewski
a2edd7d802 Testing all fwd convolution specializations. (#259)
* UniforFill with integer values.

* Log tested instance type string.

* Add UT for all convolution specializations.

* debugging conv

* Fix dangling reference bug.

* Small refinements.

* Fix call to error checking function.

* Small refinements to tests.

* Configure error tolerance
* Change problem size.
* Remove OddC case from types that do not support it.

* Add helper traits for AccumulatorDataType.

* Print first 5 errs in check_err for integral types.

* Rename FillUniform to FillUniformDistribution

* Refactor

* Do not use typed tests.
* Instead use plain fixture class with templatized member functions.
* Initialize tensors with integer values.

* Refine test instances.

* Properly set accumulator data type.
* Add another "big" instance.

* Refactor convolution tests.

* Revert "debugging conv"

This reverts commit b109516455.

* Add pragma once + format + small refinement.

* Fix some unwanted changes.

* Clang-format

* Fix profile_convnd to use renamed tensor initializer.

* Add instances for ConvFWDND kernel case 2D

* Helpers to get ConvNDFwd 2D instances.

* Refactoring.

* Remove "small block" instance as it was generating compiler errors.
* Remove default template parameters values.

* Refine and fix test.

* Fix problem with default template parameter types.
* Adjust error thresholds for floating point values test.
* Use integer values initialization for instances test.
* Add tests for ConvNDFwd 2D case.

* Remove AccumulatorDataType type trait.

* Update unit-tests.

* Remove operator<< overload.

* Unlock conv1d/3d nd fwd instances.

* Enable skipping calculating reference using flag.

* Fix number of channels for first ResNet50 layer.

* Clang-format.

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-06-22 22:05:04 -05:00
Chao Liu
85fc91c321 Minor fix for recent PR (#260)
* fix example

* update IsSupportedArgument

* fix

* disable fp64 conv example as test
2022-05-30 19:57:49 -05:00
Chao Liu
91d8b7d67a Fixing conv bug (#258)
* debugging conv

* fix oversight where ctile map is constructed before initializing c desc

* example program should returns error code

* clean up

* changed Block2CTileMap in conv2d and convnd

* clean up

* clean up

* cleanup

Co-authored-by: Anthony Chang <ac.chang@outlook.com>
2022-05-27 09:29:37 -05:00
ltqin
3e6c2610ae Add FP64 XDL GEMM built-in function (#199)
* add intrin_mfma_f64_16x16x4f64

* add example

* gemm reference add double data type

* chang init data

* fix M N PerXdlops

* fix ifdef

* add comparsion config

* add conv fwd example

* format log out

* change rc matrix egister layout

* reorganize example

* reorganize example 2

* format,because merge develop

* fix call impl adding acc data type

* lost ;

* add compiler warning

* change example tunning parameters

* add test for fp64

* add instance

* add test/gemm/gemm_fp64.cpp

* fix get name issue

* remove some tunning parameter

* fix conflict

* format

* use integer value for GEMM test

* add acc data type

* remove typeid because fp16

* fix streamconfig etc bug from merging develop

* format

* remove test_gemm_xdl_fp64

* add AccDataType

* AccDataType problem

Co-authored-by: qinletao <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-26 14:48:57 -05:00
Anthony Chang
9f71ff48e2 Validate examples in CI (#233)
* validate examples in ctest runs

* format

* fix usage of check_err

* amend

* add example codes to custom target 'check'

Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-13 16:54:44 -05:00
JD
cec69bc3bc Add host API (#220)
* Add host API

* manually rebase on develop

* clean

* manually rebase on develop

* exclude tests from all target

* address review comments

* update client app name

* fix missing lib name

* clang-format update

* refactor

* refactor

* refactor

* refactor

* refactor

* fix test issue

* refactor

* refactor

* refactor

* upate cmake and readme

Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-12 09:21:01 -05:00
Adam Osewski
712e464c4e Post PR183 review fixes. (#224)
* Suppress additional warnings for googltest.

* Rename file conv_fwd_util to conv_util.

* Update includes and ConvParams member access.

* Formatting.

* Change conv_fwd_util target to conv_util

* Fix compiler errors.

* Fix leftovers.

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-10 15:41:29 -05:00
Adam Osewski
1a0cd5d160 Convolution FWD profiler refactor. (#183)
* Convolution ND

* Code unification across dimensions for generating tensor descriptors.
* Example
* Instances

* Move convnd f32 instance file to comply with repo structure.

* Conv 1D tensor layouts.

* Formatting and use ReferenceConv

* Reference ConvFwd supporting 1D and 2D convolution.

* Debug printing TensorLayout name.

* Conv fwd 1D instance f32

* Refactor conv ND example.

Needed to support various conv dimensio.

Needed to support various conv dimensions

* Rename conv nd example director to prevent conflicts.

* Refactor some common utility to single file.

Plus some tests.

* Refactor GetHostTensorDescriptor + UT.

* Add 1D test case.

* Test reference convolution 1d/2d

* Remove some leftovers.

* Fix convolution example error for 1D

* Refactor test check errors utility function.

* Test Conv2D Fwd XDL

* More UT for 1D case.

* Parameterize input & weight initializers.

* Rename example to prevent conflicts.

* Split convnd instance into separate files for 1d/2d

* Address review comments.

* Fix data type for flops/gbytes calculations.

* Assign example number 11.

* 3D cases for convolution utility functions.

* 3D reference convolution.

* Add support for 3D convolution.

* Check for inputs bigger than  2GB.

* Formatting

* Support for bf16/f16/f32/i8 - conv instances + UT.

* Use check_err from test_util.hpp.

* Split convnd test into separate files for each dim.

* Fix data generation and use proper instances.

* Formatting

* Skip tensor initialization if not necessary.

* Fix CMakefiles.

* Remove redundant conv2d_fwd test.

* Lower problem size for conv3D UT.

* 3D case for convnd example.

* Remove leftovers after merge.

* Add Conv Specialization string to GetTypeString

* Skip instance causing numerical errors.

* Small fixes.

* Remove redundant includes.

* Fix namespace name error.

* Script for automatic testing and logging convolution fwd UTs

* Comment out numactl cmd.

* Refine weights initalization and relax rtol for fp16

* Move test_util.hpp to check_err.hpp

* Refine weights initalization and relax rtol for fp16

* Refactor common part of test conv utils.

* Move utility function to single common place.

* Add additional common functions to utility.

* Refactor convnd_fwd_xdl examples.

* Remove redundant files.
* Unify structure.

* Add constructor to ConvParams.

* And add input parameters validation.

* Modify conv examples to use single utility file.

* Remove check_error from host_tensor.hpp

* Get rid of check_indices function.

* Remove bf16_to_f32 function overload for scalars.

* Fix namespace.

* Add half_float::half for check_err.

* Fix conv params size in UT.

* Fix weights initialization for int8.

* Fix weights initialization for int8.

* Add type_convert when store output in ref conv 1D.

* Get back old conv2d_fwd_xdl operation.

* Silence conv debug print.

* format

* clean

* clean

* Fix merge.

* Fix namespace for check_err

* Formatting.

* Fix merge artifacts.

* Remove deleted header.

* Fix some includes and use ck::utils::check_err.

* Remove unused check_indices restored by previous merge.

* Fix namespaces after merge.

* Fix compilation error.

* Small fixes.

* Use common functions.
* Fix filename
* Fix namespaces.

* Fix merge artifact - retrieve removed by accident fun.

* Fix ConvForwardSpecialization.

* Working example of OpInstanceRunEngine for conv2dfwd UT.

* Adhere to coding style rules.

* Formatting and adhere to coding style rules.

* Fix merge artifacts.

* Utility for collecting conv fwd instances.

+ Plus commmon part for parsing cmdline params.

* Refactor FillUniform because of segfault for int8_t.

* Naming convention.

* Elegant version of device mem allocation.

* Use OpInstanceRunEngine in conv fwd nd tests.

* Multiple refinements.

* conditional init
* don't run reference op if not provided.

* Use OpInstanceRunEngine for ckProfiler conv_fwd

* Refactor common tensor fill function to separate file.

* Clean up unused functions.

* Support different init methods.

* Create CMake target for conv_fwd_util.

* Add header for profile_convnd_fwd.cpp

* Fix CMakefiles to link with conv_fwd_util where needed.

* Fix some clutter.

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-04-21 17:39:39 -05:00
Adam Osewski
abf4bdb9a9 Common forward convolution utility refactor. (#141)
* Convolution ND

* Code unification across dimensions for generating tensor descriptors.
* Example
* Instances

* Move convnd f32 instance file to comply with repo structure.

* Conv 1D tensor layouts.

* Formatting and use ReferenceConv

* Reference ConvFwd supporting 1D and 2D convolution.

* Debug printing TensorLayout name.

* Conv fwd 1D instance f32

* Refactor conv ND example.

Needed to support various conv dimensio.

Needed to support various conv dimensions

* Rename conv nd example director to prevent conflicts.

* Refactor some common utility to single file.

Plus some tests.

* Refactor GetHostTensorDescriptor + UT.

* Add 1D test case.

* Test reference convolution 1d/2d

* Remove some leftovers.

* Fix convolution example error for 1D

* Refactor test check errors utility function.

* Test Conv2D Fwd XDL

* More UT for 1D case.

* Parameterize input & weight initializers.

* Rename example to prevent conflicts.

* Split convnd instance into separate files for 1d/2d

* Address review comments.

* Fix data type for flops/gbytes calculations.

* Assign example number 11.

* 3D cases for convolution utility functions.

* 3D reference convolution.

* Add support for 3D convolution.

* Check for inputs bigger than  2GB.

* Formatting

* Support for bf16/f16/f32/i8 - conv instances + UT.

* Use check_err from test_util.hpp.

* Split convnd test into separate files for each dim.

* Fix data generation and use proper instances.

* Formatting

* Skip tensor initialization if not necessary.

* Fix CMakefiles.

* Remove redundant conv2d_fwd test.

* Lower problem size for conv3D UT.

* 3D case for convnd example.

* Remove leftovers after merge.

* Add Conv Specialization string to GetTypeString

* Skip instance causing numerical errors.

* Small fixes.

* Remove redundant includes.

* Fix namespace name error.

* Script for automatic testing and logging convolution fwd UTs

* Comment out numactl cmd.

* Refine weights initalization and relax rtol for fp16

* Move test_util.hpp to check_err.hpp

* Refine weights initalization and relax rtol for fp16

* Refactor common part of test conv utils.

* Move utility function to single common place.

* Add additional common functions to utility.

* Refactor convnd_fwd_xdl examples.

* Remove redundant files.
* Unify structure.

* Add constructor to ConvParams.

* And add input parameters validation.

* Modify conv examples to use single utility file.

* Remove check_error from host_tensor.hpp

* Get rid of check_indices function.

* Remove bf16_to_f32 function overload for scalars.

* Fix namespace.

* Add half_float::half for check_err.

* Fix conv params size in UT.

* Fix weights initialization for int8.

* Fix weights initialization for int8.

* Add type_convert when store output in ref conv 1D.

* Get back old conv2d_fwd_xdl operation.

* Silence conv debug print.

* format

* clean

* clean

* Fix merge.

* Fix namespace for check_err

* Formatting.

* Fix merge artifacts.

* Remove deleted header.

* Fix some includes and use ck::utils::check_err.

* Remove unused check_indices restored by previous merge.

* Fix namespaces after merge.

* Fix compilation error.

* Small fixes.

* Use common functions.
* Fix filename
* Fix namespaces.

* Fix merge artifact - retrieve removed by accident fun.

* Fix ConvForwardSpecialization.

* Adhere to coding style rules.

* Fix merge artifacts.

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-04-05 15:16:59 -05:00
Chao Liu
cd167e492a Compile for gfx908 and gfx90a (#130)
* adding compilation for multiple targets

* fix build

* clean

* update Jekinsfile

* update readme

* update Jenkins

* use ck::half_t instead of ushort for bf16

* rename enum classes

* clean

* rename

* clean
2022-03-31 12:33:34 -05:00
Adam Osewski
f91579aab6 Unified conv3D API + support for all data types. (#133)
* Convolution ND

* Code unification across dimensions for generating tensor descriptors.
* Example
* Instances

* Move convnd f32 instance file to comply with repo structure.

* Conv 1D tensor layouts.

* Formatting and use ReferenceConv

* Reference ConvFwd supporting 1D and 2D convolution.

* Debug printing TensorLayout name.

* Conv fwd 1D instance f32

* Refactor conv ND example.

Needed to support various conv dimensio.

Needed to support various conv dimensions

* Rename conv nd example director to prevent conflicts.

* Refactor some common utility to single file.

Plus some tests.

* Refactor GetHostTensorDescriptor + UT.

* Add 1D test case.

* Test reference convolution 1d/2d

* Remove some leftovers.

* Fix convolution example error for 1D

* Refactor test check errors utility function.

* Test Conv2D Fwd XDL

* More UT for 1D case.

* Parameterize input & weight initializers.

* Rename example to prevent conflicts.

* Split convnd instance into separate files for 1d/2d

* Address review comments.

* Fix data type for flops/gbytes calculations.

* Assign example number 11.

* 3D cases for convolution utility functions.

* 3D reference convolution.

* Add support for 3D convolution.

* Check for inputs bigger than  2GB.

* Formatting

* Support for bf16/f16/f32/i8 - conv instances + UT.

* Use check_err from test_util.hpp.

* Split convnd test into separate files for each dim.

* Fix data generation and use proper instances.

* Formatting

* Skip tensor initialization if not necessary.

* Fix CMakefiles.

* Remove redundant conv2d_fwd test.

* Lower problem size for conv3D UT.

* 3D case for convnd example.

* Remove leftovers after merge.

* Add Conv Specialization string to GetTypeString

* Skip instance causing numerical errors.

* Small fixes.

* Remove redundant includes.

* Fix namespace name error.

* Script for automatic testing and logging convolution fwd UTs

* Comment out numactl cmd.

* Refine weights initalization and relax rtol for fp16

* Fix weights initialization for int8.

* Add type_convert when store output in ref conv 1D.

* Get back old conv2d_fwd_xdl operation.

* Silence conv debug print.

* format

* clean

* clean

* Fix merge.

* Fix namespace for check_err

Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-03-23 10:23:13 -05:00
Chao Liu
5d37d7bff4 Reorganize files, Part 1 (#119)
* delete obselete files

* move files

* build

* update cmake

* update cmake

* fix build

* reorg examples

* update cmake for example and test
2022-03-08 21:46:36 -06:00