Files
sglang/docs_new/docs/basic_usage/gpt_oss.mdx
Mingyi a3291b5654 Add new Mintlify documentation site (docs_new/) (#23001)
Co-authored-by: AdityaVKochar <adityavardhankochar@gmail.com>
Co-authored-by: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>
Co-authored-by: adhyan-jain <adhyanjain2006@gmail.com>
Co-authored-by: Adhyan Jain <71976554+adhyan-jain@users.noreply.github.com>
Co-authored-by: Maitri-shah29 <maitrirajivshah@gmail.com>
Co-authored-by: Adarsh Shirawalmath <114558126+adarshxs@users.noreply.github.com>
Co-authored-by: Maitri Shah <shah29maitri@gmail.com>
Co-authored-by: Aditya Vardhan Kochar <80113212+AdityaVKochar@users.noreply.github.com>
Co-authored-by: Rishit Shivam <164783543+pokymono@users.noreply.github.com>
Co-authored-by: Rishitshivam <164783543+Rishitshivam@users.noreply.github.com>
Co-authored-by: IshhanKheria <ishhankheria06@gmail.com>
Co-authored-by: Ishita Joshi <ishitata.joshi@gmail.com>
Co-authored-by: Richard Chen <104477092+Richardczl98@users.noreply.github.com>
Co-authored-by: longGGGGGG <553746008@qq.com>
Co-authored-by: Richard <richardchen@radixark.ai>
Co-authored-by: Nakul Sinha <nakul.new4socials@gmail.com>
Co-authored-by: Divyam Agrawal <ludicrouslytrue@gmail.com>
Co-authored-by: Richardczl98 <Zhenlinc@stanford.edu>
Co-authored-by: Krishang Zinzuwadia <krishangzinzuwadia@gmail.com>
Co-authored-by: nimeshas <nimesha.s106@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jignas Paturu <86356085+JignasP@users.noreply.github.com>
Co-authored-by: zijiexia <37504505+zijiexia@users.noreply.github.com>
2026-04-20 15:10:22 -07:00

182 lines
8.6 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
title: "GPT OSS Usage"
metatags:
description: "Deploy GPT-OSS with SGLang: OpenAI Responses API compatible, built-in tools for web search and Python execution, reasoning levels, MCP tool server support."
---
Please refer to [#8833](https://github.com/sgl-project/sglang/issues/8833).
## Responses API & Built-in Tools
### Responses API
GPTOSS is compatible with the OpenAI Responses API. Use `client.responses.create(...)` with `model`, `instructions`, `input`, and optional `tools` to enable builtin tool use. You can set reasoning level via `instructions`, e.g., "Reasoning: high" (also supports "medium" and "low") — levels: low (fast), medium (balanced), high (deep).
### Built-in Tools
GPTOSS can call builtin tools for web search and Python execution. You can use the demo tool server or connect to external MCP tool servers.
#### Python Tool
- Executes short Python snippets for calculations, parsing, and quick scripts.
- By default runs in a Docker-based sandbox. To run on the host, set `PYTHON_EXECUTION_BACKEND=UV` (this executes model-generated code locally; use with care).
- Ensure Docker is available if you are not using the UV backend. It is recommended to run `docker pull python:3.11` in advance.
#### Web Search Tool
- Uses the Exa backend for web search.
- Requires an Exa API key; set `EXA_API_KEY` in your environment. Create a key at `https://exa.ai`.
### Tool & Reasoning Parser
- We support OpenAI Reasoning and Tool Call parser, as well as our SGLang native api for tool call and reasoning. Refer to [reasoning parser](../advanced_features/separate_reasoning) and [tool call parser](../advanced_features/tool_parser) for more details.
## Notes
- Use **Python 3.12** for the demo tools. And install the required `gpt-oss` packages.
- The default demo integrates the web search tool (Exa backend) and a demo Python interpreter via Docker.
- For search, set `EXA_API_KEY`. For Python execution, either have Docker available or set `PYTHON_EXECUTION_BACKEND=UV`.
Examples:
```bash Command
export EXA_API_KEY=YOUR_EXA_KEY
# Optional: run Python tool locally instead of Docker (use with care)
export PYTHON_EXECUTION_BACKEND=UV
```
Launch the server with the demo tool server:
```bash Command
python3 -m sglang.launch_server \
--model-path openai/gpt-oss-120b \
--tool-server demo \
--tp 2
```
For production usage, sglang can act as an MCP client for multiple services. An [example tool server](https://github.com/openai/gpt-oss/tree/main/gpt-oss-mcp-server) is provided. Start the servers and point sglang to them:
```bash Command
mcp run -t sse browser_server.py:mcp
mcp run -t sse python_server.py:mcp
python -m sglang.launch_server ... --tool-server ip-1:port-1,ip-2:port-2
```
The URLs should be MCP SSE servers that expose server information and well-documented tools. These tools are added to the system prompt so the model can use them.
## Speculative Decoding
SGLang supports speculative decoding for GPT-OSS models using EAGLE3 algorithm. This can significantly improve decoding speed, especially for small batch sizes.
**Usage**:
Add `--speculative-algorithm EAGLE3` along with the draft model path.
```bash Command
python3 -m sglang.launch_server \
--model-path openai/gpt-oss-120b \
--speculative-algorithm EAGLE3 \
--speculative-draft-model-path lmsys/EAGLE3-gpt-oss-120b-bf16 \
--tp 2
```
<Tip>
To enable the experimental overlap scheduler for EAGLE3 speculative decoding, set the environment variable `SGLANG_ENABLE_SPEC_V2=1`. This can improve performance by enabling overlap scheduling between draft and verification stages.
</Tip>
### Quick Demo
```python Example
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:30000/v1",
api_key="sk-123456"
)
tools = [
{"type": "code_interpreter"},
{"type": "web_search_preview"},
]
# Reasoning level example
response = client.responses.create(
model="openai/gpt-oss-120b",
instructions="You are a helpful assistant."
reasoning_effort="high" # Supports high, medium, or low
input="In one sentence, explain the transformer architecture.",
)
print("====== reasoning: high ======")
print(response.output_text)
# Test python tool
response = client.responses.create(
model="openai/gpt-oss-120b",
instructions="You are a helfpul assistant, you could use python tool to execute code.",
input="Use python tool to calculate the sum of 29138749187 and 29138749187", # 58,277,498,374
tools=tools
)
print("====== test python tool ======")
print(response.output_text)
# Test browser tool
response = client.responses.create(
model="openai/gpt-oss-120b",
instructions="You are a helfpul assistant, you could use browser to search the web",
input="Search the web for the latest news about Nvidia stock price",
tools=tools
)
print("====== test browser tool ======")
print(response.output_text)
```
Example output:
```text Output
====== test python tool ======
The sum of 29,138,749,187 and 29,138,749,187 is **58,277,498,374**.
====== test browser tool ======
**Recent headlines on Nvidia (NVDA) stock**
<table style={{width: "100%", borderCollapse: "collapse", tableLayout: "fixed"}}>
<colgroup>
<col style={{width: "25%"}} />
<col style={{width: "25%"}} />
<col style={{width: "25%"}} />
<col style={{width: "25%"}} />
</colgroup>
<thead>
<tr style={{borderBottom: "2px solid #d55816"}}>
<th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Date (2025)</th>
<th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Source</th>
<th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.02)"}}>Key news points</th>
<th style={{textAlign: "left", padding: "10px 12px", fontWeight: 700, whiteSpace: "nowrap", backgroundColor: "rgba(255,255,255,0.05)"}}>Stockprice detail</th>
</tr>
</thead>
<tbody>
<tr>
<td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}>**May13**</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>Reuters</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}>The market data page shows Nvidia trading “higher” at **$116.61** with no change from the previous close.</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>**$116.61** latest trade (delayed15min)【14†L34-L38】</td>
</tr>
<tr>
<td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}>**Aug18**</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>CNBC</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}>MorganStanley kept an **overweight** rating and lifted its price target to **$206** (up from $200), implying a 14% upside from the Friday close. The firm notes Nvidia shares have already **jumped 34% this year**.</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>No exact price quoted, but the article signals strong upside expectations【9†L27-L31】</td>
</tr>
<tr>
<td style={{padding: "9px 12px", fontWeight: 500, backgroundColor: "rgba(255,255,255,0.02)"}}>**Aug20**</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>The MotleyFool</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.02)"}}>Nvidia is set to release its Q2 earnings on Aug27. The article lists the **current price of $175.36**, down 0.16% on the day (as of 3:58p.m.ET).</td>
<td style={{padding: "9px 12px", backgroundColor: "rgba(255,255,255,0.05)"}}>**$175.36** current price on Aug20【10†L12-L15】【10†L53-L57】</td>
</tr>
</tbody>
</table>
**What the news tells us**
* Nvidias share price has risen sharply this year up roughly a third according to MorganStanley and analysts are still raising targets (now $206).
* The most recent market quote (Reuters,May13) was **$116.61**, but the stock has surged since then, reaching **$175.36** by midAugust.
* Upcoming earnings on **Aug27** are a focal point; both the MotleyFool and MorganStanley expect the results could keep the rally going.
**Bottom line:** Nvidias stock is on a strong upward trajectory in 2025, with price targets climbing toward $200$210 and the market price already near $175 as of late August.
```