Mirror of https://github.com/NVIDIA/cutlass.git (synced 2026-06-29 19:07:07 +00:00)
* v4.3.3 update.

* fix print_layout printf format in device code (#2688)

* fix print_layout printf format in device code

* Replace %.*s format specifier with explicit loop

* Remove unused delim variable

The printf format %.*s with dynamic width does not work correctly in CUDA
device code, causing literal %.*s to appear in output.

Fixes #2496

* Update include/cute/util/print_tensor.hpp

Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com>

* Update include/cute/util/print_tensor.hpp

Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com>

---------

Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com>

* Support PDL for SM90 Array TMA GEMM

* Update changelog

---------

Co-authored-by: Amin Sedaghat <35748194+Aminsed@users.noreply.github.com>
Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com>
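
For readers unfamiliar with the print_layout fix, here is a minimal sketch of the general technique it describes: replacing a `%.*s` format (where the character count is passed as an argument) with an explicit per-character loop. This is not the code from include/cute/util/print_tensor.hpp; the helper name `print_prefix`, its signature, and the `SKETCH_HOST_DEVICE` macro are hypothetical and exist only for illustration. The sketch compiles as plain C++ and, under the assumption that it is built with nvcc, the macro would also mark the function as callable from device code.

```cpp
#include <cstdio>

// Hypothetical macro for this sketch: expands to CUDA qualifiers only
// when compiled by nvcc, so the same code also builds as plain C++.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical helper (not part of CuTe): prints at most n characters of s,
// stopping early at the terminating NUL. It stands in for
//   printf("%.*s", n, s);
// and uses only the plain "%c" specifier, one character per call.
// Returns the number of characters actually printed.
SKETCH_HOST_DEVICE inline int print_prefix(const char* s, int n) {
  int printed = 0;
  for (int i = 0; i < n && s[i] != '\0'; ++i) {
    printf("%c", s[i]);
    ++printed;
  }
  return printed;
}
```

A call such as `print_prefix("+-----", 3)` emits the first three characters of the string and returns 3. The loop trades a single formatted call for several simple ones, which may be slower but avoids depending on how a given printf implementation handles argument-supplied precision.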