Yi DING
8e3d84aba3
[CK_TILE] ABQuant New Preshuffle (#3638)
...
* Refactor
* Gemm quant improvement
* Change preshuffle
* Fix
* Fix grouped gemm ut
* Fix
---------
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2026-01-27 23:46:49 -08:00
Thomas Ning
00c46785a8
Shuffle fix for gfx950 (#3491)
...
* solve compiler issue
* solve the gfx950 mfma shuffle regression
* refactor jenkinsfile to handle arch name better
* [CK TILE] set divisor to count of thread along k dimension
* fix the compiler error
* solve degradation
* Finish the multiplies fix
* fix the scales
* solve compilation error
* solve the composes
* solve the error of tile sweeper
* fix the test and example
* fix for gfx950
---------
Co-authored-by: Max Podkorytov <4273004+tenpercent@users.noreply.github.com>
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
Co-authored-by: Cong Ma <congma13@amd.com>
2026-01-13 09:21:29 -08:00
linqunAMD
6d7299ff78
[ck_tile] remove duplicate functions in ck_tile (#3311)
...
* [ck_tile] remove duplicated shuffle_b and shuffle_b_permuteN
* [ck_tile] move get_k_warp to gemm_shape
* resolve code rebase error
2025-12-15 07:13:00 -08:00
Khushbu Agarwal
6b1bceca7b
[CK_Tile] Enable PreshuffleB for 2d block scale Gemm (#3298)
...
* formatted
* formatted
* formatting
* formatting
* formatting
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Split cpp file to reduce building time
- Support multiple GemmConfig
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update Readme
* enable prefill shapes
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Add support for rowcol and tensor GEMM operations
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update README
* adding preshuffle quant as new parameter and its associated new files
* remove debugging statements
* adding test
* enable preshuffle quant with permuteN
* updating readme and corresponding gemmconfigs
* updating cmake file
* fixing CI failures for grouped quant gemm
* debugging permuteN
* debugging
* debugging PermuteN
* initial commit
* resolving merge conflicts
* adding test cases
* fixing bq tensor calculation
---------
Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-12-05 09:57:52 -08:00
Aviral Goel
de6466481f
chore(copyright): update copyright header for include directory (#3293)
2025-11-26 11:00:05 -07:00
Khushbu Agarwal
8111572785
[CK_Tile] Support for preshuffle weight(B) quant tensor for block scale gemm (#3165)
...
* formatted
* formatted
* formatting
* formatting
* formatting
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Split cpp file to reduce building time
- Support multiple GemmConfig
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update Readme
* enable prefill shapes
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Add support for rowcol and tensor GEMM operations
* [CK TILE GEMM] Refactor block_scale_gemm examples
- Update README
* adding preshuffle quant as new parameter and its associated new files
* remove debugging statements
* adding test
* enable preshuffle quant with permuteN
* updating readme and corresponding gemmconfigs
* updating cmake file
* fixing CI failures for grouped quant gemm
* addressing review comments
* fixing CI issue
* addressing review comments
* formatting
* formatting
* fixing aquant operator overloading
* formatting
---------
Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-11-24 07:48:42 -08:00
linqunAMD
1b1c46e508
[CK_TILE] Fix gemm_quant (#3186)
2025-11-11 08:23:57 -08:00
Enrico Degregori
4ebc48a3cd
WMMA gemm_add_relu_add_layernorm (#2989)
...
* Summary:
- Refactor epilogue (with CShuffle) to support fused operations:
  - EpilogueCShuffleBase holds common parts
  - EpilogueCShuffle: runs CShuffle and writes out
  - EpilogueWelfordCShuffle: holds Welford-specific arguments; runs CShuffle, write-out, the first part of Welford, and the Welford write-out
- Extend thread transfer v7r3:
  - Support for an intermediate data type different from the src and dst types
  - New functionality to write to the dst buffer and keep the data (so it can be used for additional operations)
* Address review comments
2025-10-31 11:19:26 -07:00
Khushbu Agarwal
b11f53a484
Fix quant scale matrix layout for block scale gemm (#3079)
...
* Adding support for TiledPermuteN
* Adding test
* moving shuffle functions to common place
* resolving commit hook
* fix formatting
2025-10-27 13:56:07 -07:00