Commit Graph

6 Commits

Author SHA1 Message Date
kingbri
751627e571 OAI: Add fasttensors to model load endpoint
Also fix logging when loading prompt templates.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 01:08:02 -05:00
kingbri
fc4570220c API + Model: Add new parameters and clean up documentation
The example JSON fields were changed because of the new sampler
default strategy. Fix these by manually changing the values.

Also add support for fasttensors and expose generate_window to
the API. It's recommended to not adjust generate_window as it's
dynamically scaled based on max_seq_len by default.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
kingbri
90fb41a77a Model: Fix prompt template initialization
The previous commit iterated through multiple try conditions which
made it so the user has to provide a dummy prompt template. Now,
template loading is fallback based.

Run through a loop of functions and return if one of them succeeds.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
kingbri
740b0215dd Model: Dynamically scale generate_window
Allows for adjustment of reservation space at the end of the context
before rolling it. This should be scaled as a model's max_seq_len
goes up.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
kingbri
de0ba7214c API: Add template switching and unload endpoints
Templates can be switched and unloaded without reloading the entire
model.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
kingbri
78f920eeda Tree: Refactor code organization
Move common functions into their own folder and refactor the backends
to use their own folder as well.

Also cleanup imports and alphabetize import statments themselves.

Finally, move colab and docker into their own folders as well.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00