* chore(copyright): update copyright header for test directory
* chore(copyright): update copyright header for test directory
* chore(copyright): update copyright header for client_example directory
* chore(copyright): update copyright header for test directory
[ROCm/composable_kernel commit: c8563f2101]
* Wrap ck host utitlies in CK namespace.
The CK and CK-Tile source code bases are incompatible because CK is not properly using namespaces everywhere. In particular, we need to put hip_check_error in the ck namespace.
Move all functions in include/ck_/host_utility that were in global namespace into the ck namespace.
There may be additional namespace problems like this, and it's possible we'll have namespace clashes. But it is good design to properly guard our to code bases (CK and CKTile) so that they can both coexist. Moreover, estabilishing this compatiblity is essential if we are going to allow the builder to instantiate kernels from either template library.
* Add using declarations to test code.
After moving some of the untils into the ck namespace, most examples and a few tests had to be updated to recognize the new namespace declarations. We add using declarations to individual compute units for functions that were previously in the global namespace.
* Add using declarations to client examples.
[ROCm/composable_kernel commit: ad57f6ef0b]
* GH-2368 Adding a basic glossary
GH-2368 Minor edits
GH-2368 Adding missing READMEs and standardization.
resolving readme updates
GH-2368 Minor improvements to documentation.
Improving some readmes.
Further improvement for readmes.
Cleaned up the documentation in 'client_example' (#2468)
Update for PR
Update ACRONYMS.md to remove trivial terms
Update ACRONYMS.md to provide detailed explanations for BF16 and BF8 formats
Apply suggestion from @spolifroni-amd
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
Apply suggestion from @spolifroni-amd
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
Update README.md to clarify CK Tile API description and remove outdated references to the Tile Engine.
revise 37_transpose readme
revise 36_copy readme
Remove references to the Tile Engine in README files for 19_gemm_multi_d and 35_batched_transpose, and update distribution links for clarity.
Remove references to the Tile Engine in multiple README files and update distribution links for consistency and clarity.
Remove references to the Tile Engine in README files across multiple examples
* GH-2368 Adding a basic glossary
GH-2368 Minor edits
GH-2368 Adding missing READMEs and standardization.
resolving readme updates
GH-2368 Minor improvements to documentation.
Improving some readmes.
Further improvement for readmes.
Cleaned up the documentation in 'client_example' (#2468)
Update for PR
Update ACRONYMS.md to remove trivial terms
Update ACRONYMS.md to provide detailed explanations for BF16 and BF8 formats
Apply suggestion from @spolifroni-amd
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
Apply suggestion from @spolifroni-amd
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
Update README.md to clarify CK Tile API description and remove outdated references to the Tile Engine.
revise 37_transpose readme
revise 36_copy readme
Remove references to the Tile Engine in README files for 19_gemm_multi_d and 35_batched_transpose, and update distribution links for clarity.
Remove references to the Tile Engine in multiple README files and update distribution links for consistency and clarity.
Remove references to the Tile Engine in README files across multiple examples
Refine README files by removing outdated references to the Tile Engine
* Updates based on PR feedback 1
* Updates based on PR feedback 2
* Updates based on PR feedback 3
* Updates based on PR feedback 4
* Updates based on PR feedback 5
* Updates based on PR feedback 6
* Updates based on PR feedback 7
* Updates based on PR feedback 8
* Content Modification of CK Tile Example
* Modify the ck_tile gemm config
---------
Co-authored-by: AviralGoelAMD <aviral.goel@amd.com>
Co-authored-by: ThomasNing <thomas.ning@amd.com>
[ROCm/composable_kernel commit: 92c67a824f]
* reduce the docker image size and layers
* clean up docker file
* fix linker error for client example 24
* install CK into the default /opt/rocm/ path
* restore installing CK to alternative path in CI
* add linking for utility lib
[ROCm/composable_kernel commit: d31e8249c1]
* Add CMakePresets configurations.
* Add ConvScale+ReLU Functor and an Example
* Account for ReLU FLOPs.
* Add instances of 3D convolutions with ConvscaleRelu operation.
* Implement Client Example
* Cleanup
[ROCm/composable_kernel commit: 802a8a1df1]
We are adding more instances of grouped convolution 3d forward with a ConvScale element-wise operation.
This commit handles bf8@bf8->fp8 data types combination.
* Included an example.
* Added instances.
* Added a client example.
---------
Co-authored-by: Rostyslav Geyyer <rosty.geyyer@amd.com>
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
[ROCm/composable_kernel commit: 05b10e0e5a]
* Update the element op
* Add an example
* Add instances
* Add a client example
* make sure new instances only build on gfx9
* Update element op and its handling
* Format
* Update instances to take element op as an argument
* Update examples to use random scale values
* Format
* Update client example with random scales
* Format
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
[ROCm/composable_kernel commit: ce66277a76]
* Add a scale op
* Update the element op
* Add instances
* Add an example
* Add a client example
* Add a flag check
* Revert flag check addition
* Fix flag check
* Update d strides in example
* Update d strides in client example
* Apply suggestions from code review
Update copyright header
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
* Move the example
* Move the client example
* Update element op
* Update example with the new element op
* Add scalar layout
* Update example
* Update kernel for scalar Ds
* Revert kernel changes
* Update element op
* Update example to use scales' pointers
* Format
* Update instances
* Update client example
* Move element op to unary elements
* Update element op to work with values instead of pointers
* Update instances to take element op as an argument
* Update examples to use random scale values
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
[ROCm/composable_kernel commit: cb0645bedc]
* parse examples inside the add_example_executable function
* fix the example 64 cmake file
* add xdl flag to the gemm_bias_softmax_gemm_permute example
* add filtering of tests based on architecture type
* enable test_grouped_gemm for gfx9 only
* enable test_transpose only for gfx9
* only linnk test_transpose if it gets built
* split the gemm instances by architectures
* split gemm_bilinear,grouped_conv_bwd_weight instances by targets
* split instances by architecture
* split grouped_conv instances by architecture
* fix clang format
* fix the if-else logic in group_conv headers
* small fix for grouped convolution instances
* fix the grouped conv bwd weight dl instances
* fix client examples
* only enable client examples 3 and 4 on gfx9
* set the gfx9 macro
* make sure the architecture macros are set by cmake
* use separate set of xdl/wmma flags for host code
* sinmplify the main cmake file
* add conv_fwd_bf8 instance declaration
[ROCm/composable_kernel commit: ae57e5938e]