mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-24 14:54:47 +00:00
# Change Log for Composable Kernel
Full documentation for Composable Kernel is not yet available.
## CK 0.1.1 for ROCm 5.5.0
### Fixed
- Fixed a bug in 6-dimensional kernels (#555).
- Fixed grouped ConvBwdWeight test case failure (#524).
### Optimizations
- Improved performance of the normalization kernel.
### Added
- Added user tutorial (#563).
- Added more instances for irregular GEMM sizes (#560).
- Added inter-wave consumer-producer programming model for GEMM kernels (#310).
- Added multi-D GEMM client APIs (#534).
- Added multi-embeddings support (#542).
- Added Navi3x blockwise GEMM and real GEMM support (#541).
### Changed
- Changed ...