Update README.md (#593)

turboderp
2024-08-20 01:00:49 +02:00
committed by GitHub
parent f17feb8345
commit a72b73fc89


@@ -2,6 +2,12 @@
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
The official and recommended backend server for ExLlamaV2 is [TabbyAPI](https://github.com/theroyallab/tabbyAPI/),
which provides an OpenAI-compatible API for local or remote inference, with extended features such as HF model
downloading, embedding model support, and HF Jinja2 chat templates.
See the [wiki](https://github.com/theroyallab/tabbyAPI/wiki/1.-Getting-Started) for help getting started.
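Because TabbyAPI exposes an OpenAI-compatible API, any standard OpenAI client can talk to a local server. A minimal sketch of building a chat-completion request body is below; the endpoint URL, port, and model name are assumptions for illustration, adjust them to your own TabbyAPI setup.

```python
import json

# Hypothetical local TabbyAPI endpoint; the port and path are assumptions.
BASE_URL = "http://127.0.0.1:5000/v1/chat/completions"

def chat_request(prompt: str, model: str = "my-exl2-model") -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,  # placeholder name for the model loaded in TabbyAPI
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

# Serialize the body to POST to BASE_URL with an HTTP client of your choice.
print(json.dumps(chat_request("Hello!")))
```

The same payload shape works with the official `openai` Python package by pointing its `base_url` at the local server.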
## New in v0.1.0+: