Documented in previous commits. Also make sure that for version checking,
check the value of kwargs instead of if the key is present since requests
pass default values.
Signed-off-by: kingbri <bdashore3@proton.me>
Stop conditions was None, causing model to error out when trying to
add the EOS token to a None value.
Authentication failed when Bearer contained an empty string. To fix
this, add a condition which checks array length.
Signed-off-by: kingbri <bdashore3@proton.me>
Responses were not being properly sent as JSON. Only run pydantic's
JSON function on stream responses. FastAPI does the rest with static
responses.
Signed-off-by: kingbri <bdashore3@proton.me>
Chat completions is the endpoint that will be used by OAI in the
future. Makes sense to support it even though the completions
endpoint will be used more often.
Also unify common parameters between the chat completion and completion
requests since they're very similar.
Signed-off-by: kingbri <bdashore3@proton.me>
Models can be loaded and unloaded via the API. Also add authentication
to use the API and for administrator tasks.
Both types of authorization use different keys.
Also fix the unload function to properly free all used vram.
Signed-off-by: kingbri <bdashore3@proton.me>