**This project is archived for now**. Development continues on [ExLlamaV3](https://github.com/turboderp-org/exllamav3).
# ExLlamaV2
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.