[CK] Fix FMHA sink dispatch when init_sink_value is set
(#7530)
## Summary
- Fix `traits.has_sink` in `fmha_fwd_runner.hpp` to also check
`init_sink_value != 0`, so the GPU kernel dispatches with sink support
when `-init_sink=1` is passed.
- Gate `run_sink_mask_tests` (StreamLLM) and `run_sink_init_tests`
(GPT-OSS) behind opt-in flags `-m` and `-g` in `smoke_test_fwd.sh`.
These tests require sink=true kernel instances which are excluded by the
`BUILD_TESTING` CMake filter (`*_nsink*`), causing unconditional "not
supported yet" failures (48 tests in CI). The opt-in flag approach was
borrowed from PR #6057.
## Why gate tests instead of compiling sink=true kernels?
The `BUILD_TESTING` filter in `CMakeLists.txt` uses `*_nsink*` glob
patterns for the `fwd` and `fwd_splitkv` APIs, excluding sink=true
kernel instances from compilation. We chose opt-in flags over widening
the filter because:
- **Compile time**: Enabling sink=true kernels doubles the kernel
variants for `fwd` and `fwd_splitkv` APIs. The filter exists
specifically to reduce CI build times.
- **Incremental enablement**: Sink support (StreamLLM / GPT-OSS) is
still maturing. Gating lets teams opt in explicitly (`smoke_test_fwd.sh
-g`) while keeping the default CI path fast.
- **Precedent**: splitkv (`-s`) and appendkv (`-a`) tests already follow
this opt-in pattern.
## Test plan
- [ ] Run `smoke_test_fwd.sh -g` with sink=true kernels compiled and
verify sink-enabled kernels are dispatched
- [ ] Verify `smoke_test_fwd.sh` still passes without `-m` / `-g` flags
- [ ] Confirm CI no longer fails on sink tests (they are now opt-in)