turboderp-org / exllamav2
Mirror of https://github.com/turboderp-org/exllamav2.git, synced 2026-04-19 22:08:55 +00:00
exllamav2 / tests at commit 4afe616aee3e99d5eef83a959a6ff7cd0606bc56
Latest commit: 4afe616aee by turboderp (2023-10-28 20:08:39 +02:00)
Fix unhandled OoM condition when loading GPTQ model with auto split
Free minimum reserved VRAM on previous device when moving to next device
| File             | Last commit message                                                  | Date                       |
|------------------|----------------------------------------------------------------------|----------------------------|
| test_alloc.py    | 34B testing                                                          | 2023-09-10 06:15:33 +02:00 |
| test_autosplit.py| Fix unhandled OoM condition when loading GPTQ model with auto split  | 2023-10-28 20:08:39 +02:00 |
| test_gemv.py     | Make sure all inference is done in torch.inference_mode()            | 2023-10-22 20:23:42 +02:00 |
| test_mmlu.py     | Add typical setting to chat example.                                 | 2023-09-26 19:50:44 +02:00 |
| test.py          | Test script                                                          | 2023-10-15 22:58:19 +02:00 |