API: Use FastAPI streaming instead of sse_starlette

sse_starlette kept sending ping responses whenever an event took too
long to be produced. Rather than using a hacky workaround, switch to
FastAPI's built-in streaming response and construct SSE packets with
a utility function.

This makes the API more robust and removes an extra dependency.

Signed-off-by: kingbri <bdashore3@proton.me>
kingbri
2023-12-01 01:54:35 -05:00
parent 6493b1d2aa
commit ae69b18583
4 changed files with 15 additions and 12 deletions
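
The approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `get_sse_packet`, `stream_events`, and `collect` are hypothetical, and the payload shape is an assumption. It only shows how an SSE "data" event can be built by hand so that an async generator can feed a plain streaming response.

```python
import asyncio
import json
from typing import AsyncIterator, Iterable


def get_sse_packet(json_data: str) -> str:
    """Wrap an already-serialized JSON string in a single SSE "data" event.

    Hypothetical helper: an SSE event is a "data: ..." line followed by a
    blank line, which marks the end of the event.
    """
    return f"data: {json_data}\n\n"


async def stream_events(chunks: Iterable[dict]) -> AsyncIterator[str]:
    """Yield one SSE packet per chunk (the chunk shape is an assumption)."""
    for chunk in chunks:
        # json.dumps emits no newlines by default, so the payload fits
        # on a single "data:" line.
        yield get_sse_packet(json.dumps(chunk))


async def collect(chunks: Iterable[dict]) -> list:
    """Drain the generator into a list; only used to inspect the output."""
    return [packet async for packet in stream_events(chunks)]
```

In a FastAPI route, a generator like this would typically be returned as `StreamingResponse(stream_events(...), media_type="text/event-stream")`, with `StreamingResponse` imported from `fastapi.responses`. Because nothing is sent unless the generator yields, no keep-alive pings appear between events.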


@@ -311,6 +311,7 @@ class ModelContainer:
        stop_conditions: List[Union[str, int]] = kwargs.get("stop", [])
        ban_eos_token = kwargs.get("ban_eos_token", False)
        # Ban the EOS token if specified. If not, append to stop conditions as well.
        if ban_eos_token: