mirror of
https://github.com/kvcache-ai/sglang.git
synced 2026-07-01 04:08:10 +00:00
Co-authored-by: Rishitshivam <164783543+Rishitshivam@users.noreply.github.com> Co-authored-by: Ratish P <114130421+Ratish1@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Adarsh Shirawalmath <114558126+adarshxs@users.noreply.github.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
870 B
870 B
Use Models From ModelScope
To use a model from ModelScope, set the environment variable SGLANG_USE_MODELSCOPE.
export SGLANG_USE_MODELSCOPE=true
We take Qwen2-7B-Instruct as an example.
Launch the Server:
python -m sglang.launch_server --model-path qwen/Qwen2-7B-Instruct --port 30000
Or start it by docker:
docker run --gpus all \
-p 30000:30000 \
-v ~/.cache/modelscope:/root/.cache/modelscope \
--env "SGLANG_USE_MODELSCOPE=true" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --host 0.0.0.0 --port 30000
Note that modelscope uses a different cache directory than huggingface. You may need to set it manually to avoid running out of disk space.