Mirror of https://github.com/theroyallab/tabbyAPI.git
synced 2026-04-21 06:48:56 +00:00
* Exposed draft model args for speculative decoding
* Exposed int8 cache, dummy models, and no flash attention
* Resolved CUDA 11.8 dependency issue
368 B