tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-03-14 15:57:27 +00:00

kingbri 078fbf1080 Model: Add quantized cache support for tensor parallel

Newer versions of exl2 v1.9-dev have quantized cache implemented. Add
those APIs.

Signed-off-by: kingbri <bdashore3@proton.me>

2024-08-22 14:15:19 -04:00
