turboderp
b311d0aca4
Remove sentencepiece dep from setup.py
2025-05-29 00:32:51 +02:00
turboderp
525b3204e0
Fix PIL dependency, skip version check in preprocessor
2024-11-10 10:31:21 +01:00
TerminalMan
d92ff8d9e4
improve installation experience (#666)
2024-11-02 21:11:14 +01:00
turboderp
b25210778c
Remove fasttensors, add platform-agnostic multithreaded ST loader
2024-09-17 00:33:16 +02:00
turboderp
65b9e17c4f
Add cross-device barrier kernel, remove event-based sync
2024-08-14 11:50:41 +02:00
turboderp
3f80e12496
Tensor P context and broadcast/gather functions
2024-08-07 14:46:18 +02:00
turboderp
28ff6180e0
Initial graph impl for QAttn and QMLP
2024-07-28 18:43:41 +02:00
turboderp
cba8f6c0d2
Add sources to setup.py
2024-07-06 18:15:07 +02:00
turboderp
9b725dd5cc
Add rich dependency to setup.py
2024-06-24 00:39:57 +02:00
turboderp
3fe6ca8010
Add C++ function for partial string matching in generator
2024-05-11 01:43:44 +02:00
turboderp
aef7bd125a
Add hadamard functions
2024-04-13 14:24:48 +02:00
turboderp
eb3fcfcc81
Add head norm module
2024-04-05 07:10:44 +02:00
turboderp
329087f96b
Fix Windows compile
2024-04-01 17:26:49 +02:00
turboderp
dd540ec13a
Rework AVX2 optimizations
2024-04-01 16:24:45 +02:00
turboderp
2f7546e9a2
Add new compiler flags to setup.py
2024-03-15 11:14:56 +01:00
turboderp
cedeb616ce
Support Qwen2
2024-02-15 20:50:24 +01:00
turboderp
80ed7a5222
Refactor extension
2024-02-10 12:05:34 +01:00
turboderp
7a9d12ae4c
Add non-RMS layernorm, support for Orion
2024-01-22 17:21:01 +01:00
turboderp
ed3067fee1
Fast safetensors load functions, experimental (not used yet)
2024-01-18 11:02:33 +01:00
turboderp
845260cff6
Fix paths in setup.py
2023-12-23 14:22:46 +01:00
turboderp
4eb05be36a
Split up compilation some more
2023-12-23 01:35:10 +01:00
turboderp
6d63f46a93
Use multiple compilation units for templated kernels to speed up build
2023-12-23 01:26:05 +01:00
turboderp
620a6ecfdb
Optimized/partially fused MoE MLP (GPTQ only for now)
2023-12-15 19:13:40 +01:00
kingbri
6bfcefe940
Tree: Force utf8 when opening files
The default encoding on Linux is UTF-8, but Windows uses cp1252, which
isn't compatible with some Unicode characters.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 19:21:29 -05:00
ardfork
a5e1c1a012
Simplify HIP compatibility
2023-11-10 13:14:41 +00:00
yuxiang
2766cceab7
Force the KV cache to 8-bit float; temporarily convert to half float when taken out for use.
2023-10-14 01:15:17 +08:00
turboderp
336b8689ce
Add regex requirement
2023-10-10 17:51:35 +02:00
turboderp
f9b45afc05
Faster GEMM kernels for tall/wide matrices
2023-10-07 19:21:56 +02:00
turboderp
0f9f5b6aab
Initial LoRA support (WIP)
2023-10-06 20:54:12 +02:00
turboderp
5d163ca4d5
Try using version.py again
2023-10-05 01:47:24 +02:00
turboderp
166c9fe7dc
Roll back changes to setup.py
2023-10-05 01:30:28 +02:00
turboderp
c2c4d15497
Fix version import in setup.py
2023-10-05 00:37:31 +02:00
turboderp
57661ff958
Add requirements: pygments, websockets
2023-10-01 12:50:45 +02:00
turboderp
51a0104bba
WebSocket server (WIP)
2023-09-30 23:52:11 +02:00
turboderp
4ce5b66dd4
Add select filter
2023-09-30 16:42:40 +02:00
turboderp
8466a765b9
Add version.py, fix warnings from setup.py re exllamav2/exllamav2_ext/*
2023-09-30 15:32:55 +02:00
turboderp
3ba989aee8
Bump to 0.0.4
2023-09-26 21:01:17 +02:00
jllllll
23a05b20e3
Allow pre-compiling CUDA extensions.
2023-09-25 11:47:12 -05:00
turboderp
21c44b79ce
Bump to 0.0.3
2023-09-21 21:51:23 +02:00
turboderp
c04a384a11
Bump to 0.0.2
2023-09-17 17:51:11 +02:00
turboderp
609e650b1f
Bump version
2023-09-13 04:27:08 +02:00
turboderp
b389b474eb
Add ninja requirement
2023-09-10 10:00:27 +02:00
turboderp
2617b6c012
Setuptools script
2023-09-10 09:02:05 +02:00