3 Commits

Author SHA1 Message Date
Adam Osewski
c747be612f Refactor device op implementations into impl subdirectory. (#420)
* Move kernel implementation files under impl directory.

* Update examples paths.

* Update device kernel impl include paths.

* Update tensor operation instances include paths.

* Update profiler and tests include paths.

* Clang-format

* Update include paths for batched gemm reduce

* Refactor UnitTest ConvNDBwdWeight.

* Refactor fwd and bwd data convND UT.

* Fix the test macro used.

* Fix include path.

* Fix include paths.

* Fix include paths in profiler and tests.

* Fix include paths.

Co-authored-by: Adam Osewski <aosewski@amd.com>

[ROCm/composable_kernel commit: 3048028897]
2022-10-13 09:05:08 -05:00
Qianfeng
24eab22995 Add int4 reduction examples (#372)
* Add int4 reduction examples

* Confine all uses of int4_t to within the preprocessor condition check

[ROCm/composable_kernel commit: d520d0cfc1]
2022-08-25 16:58:48 -05:00
Qianfeng
23dc96e13c Add examples for reduction fp16/fp32/bf16/int8/fp64 for 3d/4d/5d (#342)
* Update the reduce_blockwise example to support user-specified data types and input and reduction dimensions

* Add examples for using reduce_multiblock_atomic_add

* Add more running examples to the default command-line

* Remove unnecessary header includes

* Update the example README.md

[ROCm/composable_kernel commit: 14932e8de3]
2022-08-13 01:10:01 -05:00