linqunAMD
8811c57d44
[ck_tile] remove duplicate functions in ck_tile (#3311)
...
* [ck_tile] remove duplicated shuffle_b and shuffle_b_permuteN
* [ck_tile] move get_k_warp to gemm_shape
* resolve code rebase error
[ROCm/composable_kernel commit: 6d7299ff78]
2025-12-15 07:13:00 -08:00
Khushbu Agarwal
5ab9a6cfe4
[CK_Tile] Enable PreshuffleB for 2d block scale Gemm (#3298)
...
* formatted
* formatted
* formatting
* formatting
* formatting
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Split cpp file to reduce building time
- Support multiple GemmConfig
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update Readme
* enable prefill shapes
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Add support for rowcol and tensor GEMM operations
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update README
* adding preshuffle quant as new parameter and its associated new files
* remove debugging statements
* adding test
* enable preshuffle quant with permuteN
* updating readme and corresponding gemmconfigs
* updating cmake file
* fixing CI failures for grouped quant gemm
* debugging permuteN
* debugging
* debugging PermuteN
* initial commit
* resolving merge conflicts
* adding test cases
* fixing bq tensor calculation
---------
Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
[ROCm/composable_kernel commit: 6b1bceca7b]
2025-12-05 09:57:52 -08:00
Aviral Goel
216c23b945
chore(copyright): update copyright header for include directory (#3293)
...
[ROCm/composable_kernel commit: de6466481f]
2025-11-26 11:00:05 -07:00
Khushbu Agarwal
7d6cd1f3c4
[CK_Tile] Support for preshuffle weight(B) quant tensor for block scale gemm (#3165)
...
* formatted
* formatted
* formatting
* formatting
* formatting
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Split cpp file to reduce building time
- Support multiple GemmConfig
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update Readme
* enable prefill shapes
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Add support for rowcol and tensor GEMM operations
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update README
* adding preshuffle quant as new parameter and its associated new files
* remove debugging statements
* adding test
* enable preshuffle quant with permuteN
* updating readme and corresponding gemmconfigs
* updating cmake file
* fixing CI failures for grouped quant gemm
* addressing review comments
* fixing CI issue
* addressing review comments
* formatting
* formatting
* fixing aquant operator overloading
* formatting
---------
Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
[ROCm/composable_kernel commit: 8111572785]
2025-11-24 07:48:42 -08:00
linqunAMD
13cf0bd17f
[CK_TILE] Fix gemm_quant (#3186)
...
[ROCm/composable_kernel commit: 1b1c46e508]
2025-11-11 08:23:57 -08:00
Enrico Degregori
e6be7bcc2a
WMMA gemm_add_relu_add_layernorm (#2989)
...
* Summary:
  - Refactor epilogue (with CShuffle) to support fused operations:
    - EpilogueCShuffleBase: holds common parts
    - EpilogueCShuffle: runs CShuffle and writes out
    - EpilogueWelfordCShuffle: holds Welford-specific arguments, runs CShuffle, writes out, then runs the first part of Welford and the Welford write-out
  - Extend thread transfer v7r3:
    - Support for an intermediate data type different from the src and dst types
    - New functionality to write to the dst buffer and keep the data (so it can be used for additional operations)
* Address review comments
[ROCm/composable_kernel commit: 4ebc48a3cd]
2025-10-31 11:19:26 -07:00
Khushbu Agarwal
35cb7500e4
Fix quant scale matrix layout for block scale gemm (#3079)
...
* Adding support for TiledPermuteN
* Adding test
* moving shuffle functions to common place
* resolving commit hook
* fix formatting
[ROCm/composable_kernel commit: b11f53a484]
2025-10-27 13:56:07 -07:00