From a72b73fc898632cd450c2332eb277500cd847681 Mon Sep 17 00:00:00 2001
From: turboderp <11859846+turboderp@users.noreply.github.com>
Date: Tue, 20 Aug 2024 01:00:49 +0200
Subject: [PATCH] Update README.md (#593)

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 4f640f6..cc93cc2 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,12 @@
 ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
 
+The official and recommended backend server for ExLlamaV2 is [TabbyAPI](https://github.com/theroyallab/tabbyAPI/),
+which provides an OpenAI-compatible API for local or remote inference, with extended features such as HF model
+downloading, embedding model support, and support for HF Jinja2 chat templates.
+
+See the [wiki](https://github.com/theroyallab/tabbyAPI/wiki/1.-Getting-Started) for help getting started.
+
 ## New in v0.1.0+:
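Since the added README text points users at TabbyAPI's OpenAI-compatible API, a minimal sketch of what a chat-completion request body for such a server looks like may help. The host, port, endpoint path, and model name below are assumptions for illustration, not taken from this patch; check your own TabbyAPI configuration for the real values.

```python
import json

# Assumed local server address; TabbyAPI's actual host/port come from its config.
BASE_URL = "http://localhost:5000/v1"

# Hypothetical model name; substitute the model your server has loaded.
payload = {
    "model": "my-exl2-model",
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
}

# Serialize the request body as JSON, as an OpenAI-style API expects.
body = json.dumps(payload)

# To actually send this, POST it to f"{BASE_URL}/chat/completions" with an
# Authorization header (e.g. via the `requests` library or the official
# `openai` client pointed at BASE_URL), once the server is running.
print(body)
```

Because the endpoint follows the OpenAI wire format, existing OpenAI client libraries can typically be reused by overriding their base URL.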