Mirror of https://github.com/kvcache-ai/sglang.git (synced 2026-06-30 03:37:51 +00:00)
DeepseekV32ForCausalLM was missing from the model_arch guard in
_handle_model_specific_adjustments(), so is_deepseek_nsa() was never
reached for V3.2 models. As a result, the NSA attention backend was not
auto-selected, which led to a q_rope TypeError with flashinfer or
incorrect behavior with other backends.

This is an upstream bug introduced in sgl-project/sglang#13687 (commit
618ca2380), which refactored the flat is_deepseek_nsa() check into a
nested block under the model_arch guard but listed only
DeepseekV3ForCausalLM.
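
The failure mode can be sketched in isolation. The snippet below is a
simplified illustration, not the actual sglang source: the function
select_attention_backend(), the dict-based config, the "uses_nsa" key,
and the backend name strings are all hypothetical stand-ins. Only the
architecture names and the nesting of is_deepseek_nsa() under an
architecture guard come from the description above.

```python
# Simplified sketch of the guard bug; names other than the two
# architecture strings and is_deepseek_nsa are hypothetical.

GUARD_BEFORE_FIX = {"DeepseekV3ForCausalLM"}
GUARD_AFTER_FIX = {"DeepseekV3ForCausalLM", "DeepseekV32ForCausalLM"}


def is_deepseek_nsa(config: dict) -> bool:
    # Assumption: a single boolean flag stands in for whatever the real
    # check inspects in the model config.
    return bool(config.get("uses_nsa", False))


def select_attention_backend(model_arch: str, config: dict,
                             guarded_archs: set,
                             default: str = "flashinfer") -> str:
    # The NSA check is nested under the architecture guard, so it is
    # only reached for architectures listed in guarded_archs.
    if model_arch in guarded_archs:
        if is_deepseek_nsa(config):
            return "nsa"
    return default


v32_config = {"uses_nsa": True}

# Before the fix: V3.2 is not in the guard, so the NSA check is skipped
# and the default backend is kept.
before = select_attention_backend(
    "DeepseekV32ForCausalLM", v32_config, GUARD_BEFORE_FIX)

# After the fix: V3.2 is in the guard, so the NSA backend is selected.
after = select_attention_backend(
    "DeepseekV32ForCausalLM", v32_config, GUARD_AFTER_FIX)
```

In this sketch the fix amounts to adding DeepseekV32ForCausalLM to the
guarded set; the nested check itself is unchanged.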