turboderp
cedeb616ce
Support Qwen2
2024-02-15 20:50:24 +01:00
turboderp
80ed7a5222
Refactor extension
2024-02-10 12:05:34 +01:00
turboderp
7a9d12ae4c
Add non-RMS layernorm, support for Orion
2024-01-22 17:21:01 +01:00
turboderp
ed3067fee1
Fast safetensors load functions, experimental (not used yet)
2024-01-18 11:02:33 +01:00
turboderp
845260cff6
Fix paths in setup.py
2023-12-23 14:22:46 +01:00
turboderp
4eb05be36a
Split up compilation some more
2023-12-23 01:35:10 +01:00
turboderp
6d63f46a93
Use multiple compilation units for templated kernels to speed up build
2023-12-23 01:26:05 +01:00
turboderp
620a6ecfdb
Optimized/partially fused MoE MLP (GPTQ only for now)
2023-12-15 19:13:40 +01:00
kingbri
6bfcefe940
Tree: Force utf8 when opening files
The default encoding on Linux is utf8, but Windows uses cp1252, which
isn't compatible with some Unicode characters.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 19:21:29 -05:00
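Note on the commit above: a minimal sketch of what forcing utf8 when opening files can look like in Python. The helper names `write_text_utf8` and `read_text_utf8` are hypothetical and are not taken from the repository; the only real API used is the `encoding` parameter of the built-in `open()`.

```python
import os
import tempfile


def write_text_utf8(path, text):
    # Hypothetical helper: pass the encoding explicitly so the bytes written
    # do not depend on the platform default (often cp1252 on Windows).
    with open(path, "w", encoding="utf8") as f:
        f.write(text)


def read_text_utf8(path):
    # Hypothetical helper: decode as utf8 regardless of the platform default.
    with open(path, "r", encoding="utf8") as f:
        return f.read()


if __name__ == "__main__":
    # Round-trip a string containing characters that cp1252 cannot encode.
    sample = "caf\u00e9 \u2192 \u4e16\u754c"
    path = os.path.join(tempfile.mkdtemp(), "sample.txt")
    write_text_utf8(path, sample)
    print(read_text_utf8(path) == sample)
```

Without the explicit `encoding` argument, the same calls would use the locale's preferred encoding, which is where the Linux/Windows difference described in the commit message comes from.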
ardfork
a5e1c1a012
Simplify HIP compatibility
2023-11-10 13:14:41 +00:00
yuxiang
2766cceab7
Force the KV cache to 8-bit float storage; convert temporarily to half float when read for use.
2023-10-14 01:15:17 +08:00
turboderp
336b8689ce
Add regex requirement
2023-10-10 17:51:35 +02:00
turboderp
f9b45afc05
Faster GEMM kernels for tall/wide matrices
2023-10-07 19:21:56 +02:00
turboderp
0f9f5b6aab
Initial LoRA support (WIP)
2023-10-06 20:54:12 +02:00
turboderp
5d163ca4d5
Try using version.py again
2023-10-05 01:47:24 +02:00
turboderp
166c9fe7dc
Roll back changes to setup.py
2023-10-05 01:30:28 +02:00
turboderp
c2c4d15497
Fix version import in setup.py
2023-10-05 00:37:31 +02:00
turboderp
57661ff958
Add requirements: pygments, websockets
2023-10-01 12:50:45 +02:00
turboderp
51a0104bba
WebSocket server (WIP)
2023-09-30 23:52:11 +02:00
turboderp
4ce5b66dd4
Add select filter
2023-09-30 16:42:40 +02:00
turboderp
8466a765b9
Add version.py, fix warnings from setup.py re exllamav2/exllamav2_ext/*
2023-09-30 15:32:55 +02:00
turboderp
3ba989aee8
Bump to 0.0.4
2023-09-26 21:01:17 +02:00
jllllll
23a05b20e3
Allow pre-compiling CUDA extensions.
2023-09-25 11:47:12 -05:00
turboderp
21c44b79ce
Bump to 0.0.3
2023-09-21 21:51:23 +02:00
turboderp
c04a384a11
Bump to 0.0.2
2023-09-17 17:51:11 +02:00
turboderp
609e650b1f
Bump version
2023-09-13 04:27:08 +02:00
turboderp
b389b474eb
Add ninja requirement
2023-09-10 10:00:27 +02:00
turboderp
2617b6c012
Setuptools script
2023-09-10 09:02:05 +02:00