Model: Auto-scale max_tokens by default

If max_tokens is None, it automatically scales to fill the remaining context.
This does not mean the generation will actually fill that context, since an
EOS token or stop string can still end it earlier.

Originally suggested by #86

Signed-off-by: kingbri <bdashore3@proton.me>
Author: kingbri
Date: 2024-03-18 22:54:59 -04:00
Parent: 8cbb59d6e1
Commit: 09a4c79847
2 changed files with 28 additions and 20 deletions


@@ -14,7 +14,7 @@ class BaseSamplerRequest(BaseModel):
     """Common class for sampler params that are used in APIs"""

     max_tokens: Optional[int] = Field(
-        default_factory=lambda: get_default_sampler_value("max_tokens", 150),
+        default_factory=lambda: get_default_sampler_value("max_tokens"),
         examples=[150],
     )