Jinja2 is a lightweight templating engine, the same one Transformers uses to render chat completion prompts. It is much more efficient than Fastchat and can be installed as a regular requirement; it also unblocks Pydantic's version pin. Users now have to provide their own template if needed; a separate repo may be usable for common prompt template storage.

Signed-off-by: kingbri <bdashore3@proton.me>
32 lines
1.9 KiB
Plaintext
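For context on the new jinja2 dependency, below is a minimal sketch of rendering a chat prompt with a Jinja2 template. The template string and message list are hypothetical placeholders, not tabbyAPI's actual prompt handling; as the commit notes, users supply their own template.

# Minimal sketch: rendering a chat prompt with Jinja2 (hypothetical template).
from jinja2 import Template

chat_template = Template(
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
    "assistant:"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the message list into a flat prompt string for the model.
print(chat_template.render(messages=messages))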
# Torch
--extra-index-url https://download.pytorch.org/whl/cu121
torch
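# Note: --extra-index-url above points pip at PyTorch's CUDA 12.1 (cu121) wheel index in addition to PyPI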
# Exllamav2
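# Prebuilt wheels; the "; platform_system ..." suffix on each URL is a PEP 508 environment marker limiting the pin to a matching OS and Python version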
# Windows
https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# Linux
https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# Pip dependencies
fastapi
pydantic >= 1, < 2
PyYAML
progress
uvicorn
jinja2

# Flash attention v2
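# Prebuilt flash-attn wheels; each filename tag pins a CUDA build (cu121/cu122), torch 2.1, and a Python ABI (cp310/cp311)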
# Windows FA2 from https://github.com/jllllll/flash-attention/releases
https://github.com/jllllll/flash-attention/releases/download/v2.3.6/flash_attn-2.3.6+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/flash-attention/releases/download/v2.3.6/flash_attn-2.3.6+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# Linux FA2 from https://github.com/Dao-AILab/flash-attention/releases
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.6/flash_attn-2.3.6+cu122torch2.1cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.6/flash_attn-2.3.6+cu122torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
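The "; platform_system == ..." suffixes above are PEP 508 environment markers. As a quick check of which pinned wheels apply on the current machine, here is a small sketch using the packaging library (an assumed extra dependency, not part of this requirements file):

# Evaluate the environment markers used above against the running interpreter.
from packaging.markers import Marker

markers = [
    'platform_system == "Windows" and python_version == "3.11"',
    'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"',
]

for m in markers:
    # Marker.evaluate() returns True when the current environment matches.
    print(m, "->", Marker(m).evaluate())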