Commit Graph

713 Commits

Author SHA1 Message Date
Po-Yen, Chen
990eed11b7 Handle the case where the user specifies all the strides 2022-08-19 16:32:37 -04:00
Po-Yen, Chen
7558d14442 Fix wrong program return value of GEMM examples 2022-08-19 16:29:48 -04:00
Po-Yen, Chen
1ce791ea05 Use more strict condition to add code in examples 2022-08-19 15:39:50 -04:00
Po-Yen, Chen
75a30f8b18 Mark Tensor<> special member functions as 'default' 2022-08-19 15:30:43 -04:00
Po-Yen, Chen
1626a6e376 Remove unnecessary copy ctor for Tensor<> 2022-08-19 15:27:21 -04:00
Po-Yen, Chen
cd395646fa Fix compilation error in check_err() 2022-08-19 15:22:26 -04:00
Po-Yen, Chen
47770c857b Allow unsigned integer arguments for check_err() 2022-08-19 15:19:34 -04:00
Po-Yen, Chen
3b0f97f6eb Revert "Add type traits 'is_signed_integral<>'"
This reverts commit f2c148efae.
2022-08-19 15:14:12 -04:00
Po-Yen, Chen
103ae7d126 Use reinterpret_cast<>() for cross-type pointer conversion 2022-08-19 15:01:32 -04:00
Po-Yen, Chen
a177ad758f Unify structured comment in examples 2022-08-19 14:57:21 -04:00
Po-Yen, Chen
e37f4ab9cc Re-format common.hpp 2022-08-19 14:50:44 -04:00
Po-Yen, Chen
f7288bc2b1 Reuse same implementation code for most of GEMM examples 2022-08-19 14:47:09 -04:00
Po-Yen, Chen
ed51c0638b Re-format template argument in example code 2022-08-19 14:31:46 -04:00
Po-Yen, Chen
5931c7ebe6 Move common codes together 2022-08-19 13:49:22 -04:00
Po-Yen, Chen
68a57e71e6 Move #include directives into new header 2022-08-19 13:24:00 -04:00
Po-Yen, Chen
42d75f356c Sort include directives 2022-08-19 12:59:46 -04:00
Po-Yen, Chen
dd5b139401 Extract int4 example common codes 2022-08-19 12:57:36 -04:00
Po-Yen, Chen
3e2f37a148 Re-format GEMM instance template arguments 2022-08-19 12:02:57 -04:00
Po-Yen, Chen
c1fbabea04 Avoid too much generalizing check_err() 2022-08-19 11:59:21 -04:00
Po-Yen, Chen
4d4a659cd6 Use ""_uz to simplify example code 2022-08-19 11:54:51 -04:00
Po-Yen, Chen
3e2371c554 Align design with other PR 2022-08-19 11:44:08 -04:00
Po-Yen, Chen
503f07c1e0 Add constraint to check_err() input reference type 2022-08-19 11:34:19 -04:00
Po-Yen, Chen
2fb766e852 Simplify tensor usages in examples 2022-08-19 11:33:25 -04:00
Po-Yen, Chen
0d5025befe Add #error directive to prevent compiling sources with wrong settings 2022-08-19 10:51:30 -04:00
Po-Yen, Chen
625f95ade4 Remove debug messages 2022-08-19 10:05:44 -04:00
Po-Yen, Chen
84843aa36f Avoid compilation error while disabling ck::int4_t support 2022-08-19 09:54:03 -04:00
Po-Yen, Chen
51d0c6794c Remove constraint of Tensor<>::CopyAsType() 2022-08-19 05:31:04 -04:00
Po-Yen, Chen
c34f8411c4 Check converted Tensor<int4_t> with golden Tensor<int8_t> 2022-08-19 04:40:13 -04:00
Po-Yen, Chen
a83c006098 Allow comparing different-sized integral types in check_err() 2022-08-19 04:39:20 -04:00
Po-Yen, Chen
726c115393 Add type constraints for integer version check_err<>() 2022-08-19 03:48:20 -04:00
Po-Yen, Chen
f2c148efae Add type traits 'is_signed_integral<>' 2022-08-19 03:47:22 -04:00
Po-Yen, Chen
463d15f9b5 Add constraint to Tensor<> templated methods 2022-08-19 03:27:41 -04:00
Po-Yen, Chen
f3f61f836b Complete the int4 examples 2022-08-19 02:19:50 -04:00
Po-Yen, Chen
2dc3357a20 Fix typo in alias names 2022-08-19 01:41:20 -04:00
Po-Yen, Chen
79480f0aee Re-use element-wise operation type alias 2022-08-19 01:39:46 -04:00
Po-Yen, Chen
dd849a8736 Re-use CopyAsType<>() to implement copy ctor 2022-08-19 01:02:36 -04:00
Po-Yen, Chen
e03cece9c4 Use different type for host tensors 2022-08-19 00:32:57 -04:00
Po-Yen, Chen
89a827cab9 Re-format source files 2022-08-19 00:32:24 -04:00
Po-Yen, Chen
cbbe2485b2 Allow conversion between Tensor<> specializations 2022-08-19 00:30:53 -04:00
Po-Yen, Chen
30ed3e218c Add int4_t support for check_err() 2022-08-19 00:30:28 -04:00
Po-Yen, Chen
194faf7837 Distinguish user-side type from kernel-side type 2022-08-18 23:43:19 -04:00
Po-Yen, Chen
70c87970ec Re-use pre-defined alias in int4 examples 2022-08-18 23:29:38 -04:00
Po-Yen, Chen
4b153bd974 Add GEMM examples for int4
Currently the source files are just copied from int8 examples
2022-08-18 23:03:36 -04:00
Illia Silin
9efd033bee restart the stages on MI200 in case of failures (#366)
* restart the stages on MI200

* fix the docker image storage issue
2022-08-18 14:54:47 -05:00
Adam Osewski
e00149ac67 int4 data type (#364)
* Introduce int4 data type.

* Add unit-tests for int4

* Compile int4 UT only when int4 enabled.

* clang-format

Co-authored-by: Adam Osewski <aosewski@amd.com>
2022-08-18 14:53:47 -05:00
Chao Liu
bac7df8faf use scale (#363) 2022-08-17 10:38:00 -05:00
Anthony Chang
c961ce9226 Hotfix LDS data hazard in fused attention (#360)
* avoid LDS data hazard in gemm_softmax_gemm pipeline

* trivial refactors

* comments

* shrink blockwise gemm v2 thread buffer size

* reclaim A block lds space during 2nd gemm

* amend

* amend
2022-08-15 12:04:20 -05:00
Qianfeng
53ea4713af Batchnorm-forward and Batchnorm-infer Implemented using generic kernels (#320)
* Implement multiple-reduction in one kernel (kernels, device ops, examples)

* Add generic elementwise kernel and device interface

* Add generator for normal-distributed data initialization

* Add host reference implementation of batchnorm-forward and batchnorm-infer

* Add examples for implementing batchnorm-forward and batchnorm-infer using generic kernels

* Remove un-needed including in batchnorm example

* Renaming generic_elementwise to elementwise in kernel and device classes/functions

* Change in gemm_layernorm examples to use DeviceElementwise instead of Device5AryElementwise

* Change in example 19_binary_elementwise to use DeviceElementwise instead of DeviceBinaryElementwise

* Change in device_cgemm_4gemm_xdl_cshuffle.hpp to use kernel_elementwise instead of kernel_binary_elementwise

* Add DeviceElementwiseBase and use it in device_normalize_instance.cpp

* Removing and renaming files

* Update to synchronize gemm_layernorm client example to the generic element-wise device op API

* Update to synchronize with the latest headers directory and HostTensorDescriptor interface renaming

* Merge two static member functions in device_elementwise.hpp

* Remove unary_elementwise_1d kernel and device
2022-08-15 10:11:02 -05:00
Chao Liu
5ee304595c fix build issue (#357)
* fix build

* exclude example_gemm_max_xdl_fp16 from testing due to random failure on gfx908
2022-08-13 15:58:31 -05:00
cloudhan
fb1cbf025b Change all device operations to use add_instance_library (#338)
* Change all device operations to use add_instance_library to avoid duplicated cmake configuration.

* update DeviceMem

Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-08-13 12:17:58 -05:00