Update dependencies, support Python 3.12, update for exl2 0.1.5 (#134)

* Dependencies: Add wheels for Python 3.12

* Model: Switch fp8 cache to Q8 cache

* Model: Add ability to set draft model cache mode

* Dependencies: Bump exllamav2 to 0.1.5

* Model: Support Q6 cache

* Config: Add Q6 cache and draft_cache_mode to config sample
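
The exllamav2 bump to 0.1.5 implies a minimum-version gate. Below is a minimal sketch of that comparison, not the project's implementation: the helper names `base_release` and `meets_minimum` are hypothetical, and plain integer tuples stand in for `packaging.version` to keep the example dependency-free. It assumes purely numeric release segments, with an optional local build tag after `+`.

```python
def base_release(installed: str) -> tuple[int, ...]:
    """Drop a local build tag (the part after "+") and return the release as ints."""
    public = installed.split("+")[0]
    return tuple(int(part) for part in public.split("."))


def meets_minimum(installed: str, required: str = "0.1.5") -> bool:
    """Return True when the installed release is at least the required one."""
    # Tuples compare element by element, so 0.1.10 correctly sorts above 0.1.5,
    # which a plain string comparison would get wrong.
    return base_release(installed) >= base_release(required)


if __name__ == "__main__":
    for candidate in ("0.1.4", "0.1.5+cu121.torch2.3.1", "0.1.10"):
        print(candidate, meets_minimum(candidate))
```

Pre-release or dev suffixes (for example `0.1.5.dev1`) would raise `ValueError` here; a full PEP 440 parser handles those cases.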
DocShotgun
2024-06-09 08:27:39 -07:00
committed by GitHub
parent dcd9428325
commit 55d979b7a5
5 changed files with 84 additions and 33 deletions

@@ -6,7 +6,7 @@ from loguru import logger
 def check_exllama_version():
     """Verifies the exllama version"""
-    required_version = version.parse("0.1.4")
+    required_version = version.parse("0.1.5")
     current_version = version.parse(package_version("exllamav2").split("+")[0])
     if current_version < required_version: