mirror of
https://github.com/NVIDIA/cutlass.git
synced 2026-05-12 01:10:08 +00:00
1. Add the '#pragma once' preprocessor directive.
2. Replace the prmt PTX instruction with the __byte_perm intrinsic.

Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
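For readers unfamiliar with the byte-permute operation named in the commit message, the following is a minimal host-side sketch of what `__byte_perm(x, y, s)` computes, as described in the CUDA Math API documentation. It is not code from this commit: `byte_perm_ref` and `check_byte_perm_ref` are hypothetical names introduced here for illustration. The intrinsic reads only the low three bits of each selector nibble, so it corresponds to the default mode of `prmt.b32` only when bit 3 of every nibble is zero (that bit requests sign replication in PTX); whether the replaced call sites satisfy this is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference model of the CUDA __byte_perm(x, y, s)
// intrinsic. Not part of CUTLASS; for illustration only.
// The eight source bytes are numbered 0..3 from x (low to high) and
// 4..7 from y (low to high). Each of the four low selector nibbles picks
// one source byte using only its low three bits.
inline std::uint32_t byte_perm_ref(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t s) {
  const std::uint64_t pool = (static_cast<std::uint64_t>(y) << 32) | x;
  std::uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    const std::uint32_t sel = (s >> (4 * i)) & 0x7u;  // low 3 bits of nibble i
    const std::uint32_t byte =
        static_cast<std::uint32_t>((pool >> (8 * sel)) & 0xFFu);
    result |= byte << (8 * i);  // place selected byte at output position i
  }
  return result;
}

// Small self-check showing a few selectors.
inline void check_byte_perm_ref() {
  const std::uint32_t x = 0x33221100u;
  const std::uint32_t y = 0x77665544u;
  assert(byte_perm_ref(x, y, 0x3210u) == x);            // identity on x
  assert(byte_perm_ref(x, y, 0x7654u) == y);            // identity on y
  assert(byte_perm_ref(x, y, 0x0123u) == 0x00112233u);  // byte reversal of x
}
```

In device code the same selection would be written as a single call to `__byte_perm`, which lets the compiler choose the instruction instead of hard-coding inline PTX.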