turboderp | 60eedf4622 | Add exit status code for quant error | 2024-06-13 20:43:49 +02:00
turboderp | 83baa98ed9 | Add machine-parseable output to convert script | 2024-05-20 01:49:34 +02:00
turboderp | 750c85e2c7 | Fixes to allow quantizing Granite | 2024-05-09 02:31:21 +02:00
turboderp | 52bc008df9 | Don't add metadata when -cf not specified | 2024-04-06 17:28:57 +02:00
turboderp | 97e8123c71 | Enable head (qk) norms for quantized models | 2024-04-05 21:35:23 +02:00
turboderp | 9c47269913 | Add parallel decoder block | 2024-03-19 18:20:44 +01:00
turboderp | 5fb2c679cb | Add quantization_config to config.json when compiling | 2024-03-12 09:09:30 +01:00
turboderp | 0b05686e76 | Refactor, clean up and consolidate architecture logic | 2024-03-06 02:46:47 +01:00
turboderp | dce84866e1 | Support for StarCoder2, initial | 2024-03-05 21:20:29 +01:00
turboderp | 2044f8a31c | Set inference_mode when compiling model | 2024-02-22 10:48:44 +01:00
turboderp | 0e9d9c1010 | Prevent tensors passed to save_file from sharing memory | 2024-02-01 10:14:36 +01:00
turboderp | 2707e28165 | Skip .bin files when compiling full model | 2024-01-22 17:34:24 +01:00
turboderp | 7a9d12ae4c | Add non-RMS layernorm, support for Orion | 2024-01-22 17:21:01 +01:00
turboderp | 1f71d17b89 | Use .union() for Python 3.8 compatibility | 2024-01-20 06:22:14 +01:00
turboderp | d2753a29b8 | Mixtral EXL2 support, initial | 2023-12-16 16:50:50 +01:00
turboderp | 2b0da96de7 | Fix edge case if last layer doesn't fit in last shard | 2023-09-23 21:23:23 +02:00
turboderp | 2a3ff14af2 | Remove repeated console output | 2023-09-20 09:54:43 +02:00
turboderp | 6fd006b9d0 | More options for converter to facilitate scripting | 2023-09-18 18:25:30 +02:00
turboderp | af1398ff16 | Conversion: ability to save sharded models (addresses OoM when compiling output file) | 2023-09-16 11:44:07 +02:00
turboderp | bb83469574 | Initial commit | 2023-08-30 11:05:23 +02:00