28 Commits

Author SHA1 Message Date
turboderp
cedeb616ce Support Qwen2 2024-02-15 20:50:24 +01:00
turboderp
80ed7a5222 Refactor extension 2024-02-10 12:05:34 +01:00
turboderp
7a9d12ae4c Add non-RMS layernorm, support for Orion 2024-01-22 17:21:01 +01:00
turboderp
ed3067fee1 Fast safetensors load functions, experimental (not used yet) 2024-01-18 11:02:33 +01:00
turboderp
845260cff6 Fix paths in setup.py 2023-12-23 14:22:46 +01:00
turboderp
4eb05be36a Split up compilation some more 2023-12-23 01:35:10 +01:00
turboderp
6d63f46a93 Use multiple compilation units for templated kernels to speed up build 2023-12-23 01:26:05 +01:00
turboderp
620a6ecfdb Optimized/partially fused MoE MLP (GPTQ only for now) 2023-12-15 19:13:40 +01:00
kingbri
6bfcefe940 Tree: Force utf8 when opening files 2023-11-29 19:21:29 -05:00
    The default encoding on linux is utf8, but Windows uses cp1252 which
    isn't compatible with some unicode characters.

    Signed-off-by: kingbri <bdashore3@proton.me>
ardfork
a5e1c1a012 Simplify HIP compatibility 2023-11-10 13:14:41 +00:00
yuxiang
2766cceab7 The kv cache is forced to be an 8bit float, temporarily converted to half float when taken out for use. 2023-10-14 01:15:17 +08:00
turboderp
336b8689ce Add regex requirement 2023-10-10 17:51:35 +02:00
turboderp
f9b45afc05 Faster GEMM kernels for tall/wide matrices 2023-10-07 19:21:56 +02:00
turboderp
0f9f5b6aab Initial LoRA support (WIP) 2023-10-06 20:54:12 +02:00
turboderp
5d163ca4d5 Try using version.py again 2023-10-05 01:47:24 +02:00
turboderp
166c9fe7dc Roll back changes to setup.py 2023-10-05 01:30:28 +02:00
turboderp
c2c4d15497 Fix version import in setup.py 2023-10-05 00:37:31 +02:00
turboderp
57661ff958 Add requirements: pygments, websockets 2023-10-01 12:50:45 +02:00
turboderp
51a0104bba WebSocket server (WIP) 2023-09-30 23:52:11 +02:00
turboderp
4ce5b66dd4 Add select filter 2023-09-30 16:42:40 +02:00
turboderp
8466a765b9 Add version.py, fix warnings from setup.py re exllamav2/exllamav2_ext/* 2023-09-30 15:32:55 +02:00
turboderp
3ba989aee8 Bump to 0.0.4 2023-09-26 21:01:17 +02:00
jllllll
23a05b20e3 Allow pre-compiling CUDA extensions. 2023-09-25 11:47:12 -05:00
turboderp
21c44b79ce Bump to 0.0.3 2023-09-21 21:51:23 +02:00
turboderp
c04a384a11 Bump to 0.0.2 2023-09-17 17:51:11 +02:00
turboderp
609e650b1f Bump version 2023-09-13 04:27:08 +02:00
turboderp
b389b474eb Add ninja requirement 2023-09-10 10:00:27 +02:00
turboderp
2617b6c012 Setuptools script 2023-09-10 09:02:05 +02:00