Mirror of https://github.com/turboderp-org/exllamav3.git (synced 2026-03-15 00:07:24 +00:00)
Link to model collection
@@ -97,7 +97,7 @@ There are some benchmark results [here](doc/exl3.md), and a full writeup on the
 Fun fact: Llama-3.1-70B-EXL3 is coherent at 1.6 bpw. With the output layer quantized to 3 bpw and a 4096-token cache, inference is possible in under 16 GB of VRAM.
 
-A selection of EXL3-quantized models is available on [🤗 Hugging Face](https://huggingface.co/turboderp).
+A selection of EXL3-quantized models is available on [🤗 Hugging Face](https://huggingface.co/collections/turboderp/exl3-models-67f2dfe530f05cb9f596d21a).
 
 ## Acknowledgements
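As a sanity check on the "under 16 GB" claim in the context line above, here is a rough back-of-envelope sketch. The architecture numbers (80 layers, 8 KV heads of dimension 128, hidden size 8192, vocab 128256 for Llama-3.1-70B) and the FP16 cache assumption are mine, not taken from this commit; treat the result as an estimate only.

```python
# Rough VRAM estimate for Llama-3.1-70B at 1.6 bpw (assumed figures, not from the repo).

PARAMS = 70e9          # total parameter count (approximate)
BPW = 1.6              # quantized bits per weight for the model body
VOCAB, HIDDEN = 128256, 8192   # assumed Llama-3.1-70B output-layer shape
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128  # assumed architecture
TOKENS = 4096          # cache length from the fun fact

# Model body at 1.6 bpw (bits -> bytes).
body_gb = PARAMS * BPW / 8 / 1e9

# Output layer quantized separately at 3 bpw.
out_gb = VOCAB * HIDDEN * 3 / 8 / 1e9

# K/V cache: tokens x layers x (K and V) x kv_heads x head_dim, FP16 (2 bytes).
cache_gb = TOKENS * LAYERS * 2 * KV_HEADS * HEAD_DIM * 2 / 1e9

total_gb = body_gb + out_gb + cache_gb
print(f"body {body_gb:.1f} GB + output {out_gb:.2f} GB "
      f"+ cache {cache_gb:.2f} GB = {total_gb:.1f} GB")
```

The sum lands just below 16 GB, consistent with the claim, though real usage also includes activations and framework overhead not counted here.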