Author | Commit | Message | Date
turboderp | 00ab0084c3 | Basic LoRA support for MoE/Mixtral. Working but pretty slow for now | 2023-12-24 01:41:23 +01:00
turboderp | b6b54dab00 | Attempt to fix VC++ weirdness | 2023-12-24 01:09:53 +01:00
turboderp | 87225fe0c1 | Optimize kernel batch performance | 2023-12-23 22:05:41 +01:00
turboderp | 7262fb8f9d | Batch latency test script | 2023-12-23 22:04:40 +01:00
turboderp | bf2710f008 | Optimizer batched sampling | 2023-12-23 22:04:10 +01:00
turboderp | 845260cff6 | Fix paths in setup.py | 2023-12-23 14:22:46 +01:00
turboderp | be1e48dc4d | Fix bug when applying offsets to position embeddings | 2023-12-23 04:58:24 +01:00
turboderp | c284648dbe | Add script to compare quantized and unquantized model | 2023-12-23 02:57:13 +01:00
AlpinDale | a531dea6a0 | Merge branch 'turboderp:master' into feat/frequency_presence_pen | 2023-12-23 01:42:00 +00:00
turboderp | 0b0afab9bd | Merge pull request #239 from AlpinDale/master (feat: add top-A sampling) | 2023-12-23 02:33:41 +01:00
turboderp | b10e53822f | Fix comment | 2023-12-23 02:33:10 +01:00
turboderp | 4eb05be36a | Split up compilation some more | 2023-12-23 01:35:10 +01:00
turboderp | 6d63f46a93 | Use multiple compilation units for templated kernels to speed up build | 2023-12-23 01:26:05 +01:00
turboderp | e922f7e295 | Fix console output in model_init | 2023-12-23 01:23:34 +01:00
turboderp | c7de810313 | Bit of cleanup | 2023-12-23 01:23:04 +01:00
AlpinDale | 1384eb540a | add frequency and presence penalties | 2023-12-21 17:19:47 +00:00
AlpinDale | ef81354232 | add top_a to sample_basic | 2023-12-21 16:11:42 +00:00
AlpinDale | 8b22c0f2d1 | do probs sorting | 2023-12-21 16:06:19 +00:00
AlpinDale | a58176af64 | just use temp_probs | 2023-12-21 15:44:34 +00:00
AlpinDale | 5131099b5f | add top_a in a few more places | 2023-12-21 15:28:34 +00:00
AlpinDale | f55bece3d3 | top_a return type is float | 2023-12-21 15:23:52 +00:00
AlpinDale | 50856f1f2b | Revert "remove algorithm include" (This reverts commit 1a20a02cc3.) | 2023-12-21 15:14:32 +00:00
AlpinDale | 1a20a02cc3 | remove algorithm include | 2023-12-21 14:55:49 +00:00
AlpinDale | 638af33e89 | add top_a sampling | 2023-12-21 14:54:47 +00:00
turboderp | c4ae226df5 | HumanEval test | 2023-12-21 12:29:09 +01:00
Ivan Sanchez | 41efa463cd | unpack prob from return of generator.stream() | 2023-12-21 10:54:52 +00:00
turboderp | 9009ba5cd2 | Fix sampling for bsz > 1 | 2023-12-20 18:14:23 +01:00
Ivan Sanchez | b908544845 | Add probabilities to streaming generator | 2023-12-20 10:01:11 +00:00
turboderp | 162fc5d62c | model_init (and test_inference.py): add option to override no. experts per token from config.json | 2023-12-19 16:38:36 +01:00
turboderp | 8f40b5f92d | Merge remote-tracking branch 'origin/master' | 2023-12-19 00:14:51 +01:00
turboderp | 5a61d6e821 | Merge pull request #137 from deltaguo/master (Fix the garbadge output for ROCM) | 2023-12-18 14:04:36 +01:00
turboderp | 9c81167e4e | Update rms_norm.cu (use warpSize provided by hip) | 2023-12-18 14:04:15 +01:00
turboderp | fb1a20fbfd | Merge pull request #231 from dvdtoth/master (Fix encoder in MMLU benchmark) | 2023-12-18 14:01:05 +01:00
turboderp | 93bca57cc4 | Free all VRAM when unloading quantized module | 2023-12-18 01:30:21 +01:00
David Toth | ed7a104e71 | Fix encoder in MMLU | 2023-12-17 21:08:27 +00:00
turboderp | d1f2952cd6 | Fix multiple caches not working with 8-bit cache mode | 2023-12-17 14:41:21 +01:00
turboderp | e6bb29f06b | Fix cache clone function | 2023-12-17 13:54:41 +01:00
turboderp | a77a051025 | Enable fused MoE kernels for num_experts = 4 | 2023-12-17 11:49:15 +01:00
turboderp | b121ee418f | Fix typo | 2023-12-17 10:40:11 +01:00
turboderp | a4ecea6d57 | Bump to 0.0.11 (v0.0.11) | 2023-12-16 23:48:53 +01:00
turboderp | 3c6ee1bb61 | Merge pull request #228 from turboderp/experimental (Merge experimental) | 2023-12-16 23:36:15 +01:00
turboderp | 79eb742bcf | Update README.md | 2023-12-16 22:06:44 +01:00
turboderp | 89587d13df | Update convert.py instructions | 2023-12-16 22:03:25 +01:00
turboderp | 02e2cb4d4a | Update convert.py instructions | 2023-12-16 21:51:35 +01:00
turboderp | 660ce041cf | Merge remote-tracking branch 'origin/experimental' into experimental | 2023-12-16 21:50:48 +01:00
turboderp | d979249790 | Merge pull request #227 from turboderp/master (Merge changes from master) | 2023-12-16 21:39:09 +01:00
turboderp | 8a19badb01 | Fix bug in standard cal dataset | 2023-12-16 20:30:40 +01:00
turboderp | 37a1322096 | Fix mistake in MLP measure | 2023-12-16 20:30:25 +01:00
turboderp | d2753a29b8 | Mixtral EXL2 support, initial | 2023-12-16 16:50:50 +01:00
turboderp | 371f875aef | Un-hardcode number of experts per token | 2023-12-15 21:07:27 +01:00