43 Commits

Author SHA1 Message Date
turboderp
b311d0aca4 Remove sentencepiece dep from setup.py 2025-05-29 00:32:51 +02:00
turboderp
525b3204e0 Fix PIL dependency, skip version check in preprocessor 2024-11-10 10:31:21 +01:00
TerminalMan
d92ff8d9e4 improve installation experience (#666) 2024-11-02 21:11:14 +01:00
turboderp
b25210778c Remove fasttensors, add platform-agnostic multithreaded ST loader 2024-09-17 00:33:16 +02:00
turboderp
65b9e17c4f Add cross-device barrier kernel, remove event-based sync 2024-08-14 11:50:41 +02:00
turboderp
3f80e12496 Tensor P context and broadcast/gather functions 2024-08-07 14:46:18 +02:00
turboderp
28ff6180e0 Initial graph impl for QAttn and QMLP 2024-07-28 18:43:41 +02:00
turboderp
cba8f6c0d2 Add sources to setup.py 2024-07-06 18:15:07 +02:00
turboderp
9b725dd5cc Add rich dependency to setup.py 2024-06-24 00:39:57 +02:00
turboderp
3fe6ca8010 Add C++ function for partial string matching in generator 2024-05-11 01:43:44 +02:00
turboderp
aef7bd125a Add hadamard functions 2024-04-13 14:24:48 +02:00
turboderp
eb3fcfcc81 Add head norm module 2024-04-05 07:10:44 +02:00
turboderp
329087f96b Fix Windows compile 2024-04-01 17:26:49 +02:00
turboderp
dd540ec13a Rework AVX2 optimizations 2024-04-01 16:24:45 +02:00
turboderp
2f7546e9a2 Add new compiler flags to setup.py 2024-03-15 11:14:56 +01:00
turboderp
cedeb616ce Support Qwen2 2024-02-15 20:50:24 +01:00
turboderp
80ed7a5222 Refactor extension 2024-02-10 12:05:34 +01:00
turboderp
7a9d12ae4c Add non-RMS layernorm, support for Orion 2024-01-22 17:21:01 +01:00
turboderp
ed3067fee1 Fast safetensors load functions, experimental (not used yet) 2024-01-18 11:02:33 +01:00
turboderp
845260cff6 Fix paths in setup.py 2023-12-23 14:22:46 +01:00
turboderp
4eb05be36a Split up compilation some more 2023-12-23 01:35:10 +01:00
turboderp
6d63f46a93 Use multiple compilation units for templated kernels to speed up build 2023-12-23 01:26:05 +01:00
turboderp
620a6ecfdb Optimized/partially fused MoE MLP (GPTQ only for now) 2023-12-15 19:13:40 +01:00
kingbri
6bfcefe940 Tree: Force utf8 when opening files
The default encoding on linux is utf8, but Windows uses cp1252 which
isn't compatible with some unicode characters.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 19:21:29 -05:00
ardfork
a5e1c1a012 Simplify HIP compatibility 2023-11-10 13:14:41 +00:00
yuxiang
2766cceab7 The kv cache is forced to be an 8bit float, temporarily converted to half float when taken out for use. 2023-10-14 01:15:17 +08:00
turboderp
336b8689ce Add regex requirement 2023-10-10 17:51:35 +02:00
turboderp
f9b45afc05 Faster GEMM kernels for tall/wide matrices 2023-10-07 19:21:56 +02:00
turboderp
0f9f5b6aab Initial LoRA support (WIP) 2023-10-06 20:54:12 +02:00
turboderp
5d163ca4d5 Try using version.py again 2023-10-05 01:47:24 +02:00
turboderp
166c9fe7dc Roll back changes to setup.py 2023-10-05 01:30:28 +02:00
turboderp
c2c4d15497 Fix version import in setup.py 2023-10-05 00:37:31 +02:00
turboderp
57661ff958 Add requirements: pygments, websockets 2023-10-01 12:50:45 +02:00
turboderp
51a0104bba WebSocket server (WIP) 2023-09-30 23:52:11 +02:00
turboderp
4ce5b66dd4 Add select filter 2023-09-30 16:42:40 +02:00
turboderp
8466a765b9 Add version.py, fix warnings from setup.py re exllamav2/exllamav2_ext/* 2023-09-30 15:32:55 +02:00
turboderp
3ba989aee8 Bump to 0.0.4 2023-09-26 21:01:17 +02:00
jllllll
23a05b20e3 Allow pre-compiling CUDA extensions. 2023-09-25 11:47:12 -05:00
turboderp
21c44b79ce Bump to 0.0.3 2023-09-21 21:51:23 +02:00
turboderp
c04a384a11 Bump to 0.0.2 2023-09-17 17:51:11 +02:00
turboderp
609e650b1f Bump version 2023-09-13 04:27:08 +02:00
turboderp
b389b474eb Add ninja requirement 2023-09-10 10:00:27 +02:00
turboderp
2617b6c012 Setuptools script 2023-09-10 09:02:05 +02:00
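
Note on commit 6bfcefe940 ("Tree: Force utf8 when opening files"): the message says that Linux defaults to utf8 while Windows uses cp1252. The sketch below is not taken from the repository. It illustrates the general technique of passing `encoding="utf8"` to `open()` so the result does not depend on the platform default. The helper name `read_text_utf8` and the sample string are hypothetical and used only for illustration.

```python
import os
import tempfile


def read_text_utf8(path):
    # Hypothetical helper: pass the encoding explicitly so decoding does not
    # fall back to the platform's locale default.
    with open(path, "r", encoding="utf8") as f:
        return f.read()


# Sample text with non-ASCII characters (an assumption for this demo).
sample = "tokenizer piece: ▁héllo"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # Write the file as UTF-8, again stating the encoding explicitly.
    with open(path, "w", encoding="utf8") as f:
        f.write(sample)

    # Reading with an explicit utf8 encoding round-trips the text.
    roundtrip = read_text_utf8(path)

    # Simulate a reader that assumes cp1252: decode the same bytes with
    # that codec, replacing any bytes it cannot map.
    with open(path, "rb") as f:
        raw = f.read()
    misdecoded = raw.decode("cp1252", errors="replace")

print(roundtrip == sample)
print(misdecoded == sample)
```

Decoding the raw bytes with `cp1252` stands in for what an `open()` call without an explicit encoding might do on a system whose default is cp1252. Stating the encoding on both the write and the read side keeps behaviour the same across platforms.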