Add the ability to use an unsafe config flag if needed and migrate
the exl2 check to a different file within the exl2 backend code.
Signed-off-by: kingbri <bdashore3@proton.me>
Cleanup how overrides are handled, class naming, and adopt exllamav2's
model class to enforce latest stable version methods rather than
adding multiple backwards compatability checks.
Signed-off-by: kingbri <bdashore3@proton.me>
Does not work if max_temp is less than or equal to min_temp. Sampler
validation will have to be refactored in the future, so the dynamic
temperature check will also be changed.
Signed-off-by: kingbri <bdashore3@proton.me>
The example JSON fields were changed because of the new sampler
default strategy. Fix these by manually changing the values.
Also add support for fasttensors and expose generate_window to
the API. It's recommended to not adjust generate_window as it's
dynamically scaled based on max_seq_len by default.
Signed-off-by: kingbri <bdashore3@proton.me>
The previous commit iterated through multiple try conditions which
made it so the user has to provide a dummy prompt template. Now,
template loading is fallback based.
Run through a loop of functions and return if one of them succeeds.
Signed-off-by: kingbri <bdashore3@proton.me>
Allows for adjustment of reservation space at the end of the context
before rolling it. This should be scaled as a model's max_seq_len
goes up.
Signed-off-by: kingbri <bdashore3@proton.me>
Move common functions into their own folder and refactor the backends
to use their own folder as well.
Also cleanup imports and alphabetize import statments themselves.
Finally, move colab and docker into their own folders as well.
Signed-off-by: kingbri <bdashore3@proton.me>