Commit Graph

97 Commits

Stephan Walter
4aeea780bc Remove Q4_3 which is no better than Q5 (#1218) 2023-04-28 23:10:43 +00:00
Georgi Gerganov
3a8a3891b3 readme : update hot topics 2023-04-28 21:32:52 +03:00
Folko-Ven
afe6563a58 Correcting link to w64devkit (#1214)
Correcting link to w64devkit (changing "seeto" to "skeeto").
2023-04-28 16:22:48 +02:00
Georgi Gerganov
f9318ab76d readme : add quantization info 2023-04-26 23:24:42 +03:00
DaniAndTheWeb
8ad378c494 Updating build instructions to include BLAS support (#1183)
* Updated build information

First update to the build instructions to include BLAS.

* Update README.md

* Update information about BLAS

* Better BLAS explanation

Adding a clearer BLAS explanation and a link to download the CUDA toolkit.

* Better BLAS explanation

* BLAS for Mac

Specifying that BLAS is already supported on Macs using the Accelerate Framework.

* Clarify the effect of BLAS

* Windows Make instructions

Added the instructions to build with Make on Windows

* Fixing typo

* Fix trailing whitespace
2023-04-26 22:03:03 +02:00
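For readers following along: the Makefile flag that enabled OpenBLAS in this period was, to the best of recollection, `make LLAMA_OPENBLAS=1` (macOS needs nothing extra, since the Accelerate framework is used by default); treat the exact flag name as an assumption here rather than a quote from the updated README.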
Pavol Rusnak
9e63abecf7 quantize : use map to assign quantization type from string (#1191)
instead of `int` (while the `int` option is still supported)

This allows the following usage:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0`

instead of:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2`
2023-04-26 18:43:27 +02:00
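A minimal sketch of the approach this commit describes, with illustrative names rather than the actual quantize.cpp code (the numeric values mirror the legacy integer arguments shown above):

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical type ids matching the old numeric CLI arguments.
enum llama_ftype { FTYPE_MOSTLY_Q4_0 = 2, FTYPE_MOSTLY_Q4_1 = 3 };

static const std::map<std::string, llama_ftype> FTYPE_MAP = {
    { "q4_0", FTYPE_MOSTLY_Q4_0 },
    { "q4_1", FTYPE_MOSTLY_Q4_1 },
};

// Accepts either a name ("q4_0") or the legacy integer ("2").
static bool parse_ftype(const std::string & arg, llama_ftype & out) {
    auto it = FTYPE_MAP.find(arg);
    if (it != FTYPE_MAP.end()) {
        out = it->second;
        return true;
    }
    char * end = nullptr;
    long v = strtol(arg.c_str(), &end, 10);
    if (!arg.empty() && end && *end == '\0') { // whole argument was numeric
        out = (llama_ftype) v;
        return true;
    }
    return false;
}
```

Under this scheme, both spellings of the command above resolve to the same quantization type.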
mgroeber9110
094250bc21 examples/main README improvements and some light refactoring (#1131) 2023-04-24 15:45:32 +00:00
Pavol Rusnak
0ba63e81ff readme : update gpt4all instructions (#980) 2023-04-23 10:21:26 +02:00
CRD716
7ecc2d9e42 Minor: fixed README grammar, spelling, and misc updates (#1071) 2023-04-19 19:52:14 +00:00
Georgi Gerganov
068083ca76 readme : add warning about Q4_2 and Q4_3 2023-04-19 19:07:54 +03:00
Georgi Gerganov
426d0c45f4 readme : update hot topics about new LoRA functionality 2023-04-18 20:10:26 +03:00
Atsushi Tatsuma
3c9a24cc72 readme : add Ruby bindings (#1029) 2023-04-17 22:34:35 +03:00
comex
3573ed90b8 py : new conversion script (#545)
Current status: Working, except for the latest GPTQ-for-LLaMa format
  that includes `g_idx`.  This turns out to require changes to GGML, so
  for now it only works if you use the `--outtype` option to dequantize it
  back to f16 (which is pointless except for debugging).

  I also included some cleanup for the C++ code.

  This script is meant to replace all the existing conversion scripts
  (including the ones that convert from older GGML formats), while also
  adding support for some new formats.  Specifically, I've tested with:

  - [x] `LLaMA` (original)
  - [x] `llama-65b-4bit`
  - [x] `alpaca-native`
  - [x] `alpaca-native-4bit`
  - [x] LLaMA converted to 'transformers' format using
        `convert_llama_weights_to_hf.py`
  - [x] `alpaca-native` quantized with `--true-sequential --act-order
        --groupsize 128` (dequantized only)
  - [x] same as above plus `--save_safetensors`
  - [x] GPT4All
  - [x] stock unversioned ggml
  - [x] ggmh

  There's enough overlap in the logic needed to handle these different
  cases that it seemed best to move to a single script.

  I haven't tried this with Alpaca-LoRA because I don't know where to find
  it.

  Useful features:

  - Uses multiple threads for a speedup in some cases (though the Python
    GIL limits the gain, and sometimes it's disk-bound anyway).

  - Combines split models into a single file (both the intra-tensor split
    of the original and the inter-tensor split of 'transformers' format
    files).  Single files are more convenient to work with and more
    friendly to future changes to use memory mapping on the C++ side.  To
    accomplish this without increasing memory requirements, it has some
    custom loading code which avoids loading whole input files into memory
    at once.

  - Because of the custom loading code, it no longer depends on PyTorch,
    which might make installing dependencies slightly easier or faster...
    although it still depends on NumPy and sentencepiece, so I don't know
    if there's any meaningful difference.  In any case, I also added a
    requirements.txt file to lock the dependency versions in case of any
    future breaking changes.

  - Type annotations checked with mypy.

  - Some attempts to be extra user-friendly:

      - The script tries to be forgiving with arguments, e.g. you can
        specify either the model file itself or the directory containing
        it.

      - The script doesn't depend on config.json / params.json, just in
        case the user downloaded files individually and doesn't have those
        handy.  But you still need tokenizer.model and, for Alpaca,
        added_tokens.json.

      - The script tries to give a helpful error message if
        added_tokens.json is missing.
2023-04-14 10:03:03 +03:00
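Since the commit motivates single-file output partly by its friendliness to future memory mapping on the C++ side, here is a hedged POSIX sketch of that idea; it is illustrative only, not the loader the project later shipped:

```cpp
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char ** argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s model.bin\n", argv[0]); return 1; }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
    // With one contiguous file, every tensor is just an offset into this
    // mapping; pages are faulted in on demand instead of read up front.
    void * addr = mmap(nullptr, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }
    // ... parse the header and tensor table at offsets within `addr` ...
    munmap(addr, (size_t) st.st_size);
    close(fd);
    return 0;
}
```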
CRD716
d74ce11c25 readme : remove python 3.10 warning (#929) 2023-04-13 16:59:53 +03:00
Genkagaku.GPT
c720fa4877 readme : llama node binding (#911)
* chore: add nodejs binding

* chore: add nodejs binding
2023-04-13 16:54:27 +03:00
Judd
b9a4538eaa zig : update build.zig (#872)
* update

* update readme

* minimize the changes.

---------

Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
2023-04-13 16:43:22 +03:00
Georgi Gerganov
be7082caef readme : change "GPU support" link to discussion 2023-04-12 14:48:57 +03:00
Georgi Gerganov
9b68f0ee36 readme : update hot topics with link to "GPU support" issue 2023-04-12 14:31:12 +03:00
Nicolai Weitkemper
e610019c01 readme: link to sha256sums file (#902)
This is to emphasize that these do not need to be obtained from elsewhere.
2023-04-12 08:46:20 +02:00
Pavol Rusnak
e4d3b4b251 Fix whitespace, add .editorconfig, add GitHub workflow (#883) 2023-04-11 19:45:44 +00:00
qouoq
4b0adc70d7 Add BAIR's Koala to supported models (#877) 2023-04-10 22:41:53 +02:00
Pavol Rusnak
944d161986 Make docker instructions more explicit (#785) 2023-04-06 08:56:58 +02:00
Georgi Gerganov
0470adafd0 Update README.md 2023-04-05 19:54:30 +03:00
Georgi Gerganov
a19b5cee08 readme : change logo + add bindings + add uis + add wiki 2023-04-05 18:56:20 +03:00
Adithya Balaji
5cdd9ef43f readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example

* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
Thatcher Chamberlin
01e2261e5f Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
2023-04-02 12:48:57 +02:00
rimoliga
34977d15c2 readme: replace termux links with homepage, play store is deprecated (#680) 2023-04-01 16:57:30 +02:00
Pavol Rusnak
e88a8002b5 drop quantize.py (now that models are using a single file) 2023-03-31 01:07:32 +02:00
Georgi Gerganov
e19e304480 readme : update supported models 2023-03-30 22:31:54 +03:00
Georgi Gerganov
32d84d4876 readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov
689ed6a51e readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Stephan Walter
3b8b2c584a Update README and comments for standalone perplexity tool (#525) 2023-03-26 16:14:01 +03:00
Georgi Gerganov
3600f1d140 Add logo to README.md 2023-03-26 10:20:49 +03:00
Georgi Gerganov
9d678e17dc Move chat scripts into "./examples" 2023-03-25 20:37:09 +02:00
Georgi Gerganov
1c1459f073 Remove obsolete information from README 2023-03-25 16:30:32 +02:00
Gary Mulder
ccf5a1b08d Update README.md (#444)
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
2023-03-24 15:23:09 +00:00
Georgi Gerganov
1f369c619d Add link to Roadmap discussion 2023-03-24 09:13:35 +02:00
Stephan Walter
3ebb023fb2 Revert "Delete SHA256SUMS for now" (#429)
* Revert "Delete SHA256SUMS for now (#416)"

This reverts commit 8eea5ae0e5.

* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-23 15:15:48 +01:00
Gary Mulder
e689dccbad Move model section from issue template to README.md (#421)
* Update custom.md

* Removed Model section as it is better placed in README.md

* Updates to README.md model section

* Inserted text that was removed from the issue template about obtaining models from FB, and links to papers describing the various models

* Removed IPFS download links for the Alpaca 7B models, as these look to be in the old data format and probably shouldn't be directly linked to anyway

* Updated the perplexity section to point at the Perplexity scores discussion (#406)
2023-03-23 11:30:40 +00:00
Georgi Gerganov
a1b7fa8c60 Adjust repetition penalty 2023-03-23 10:46:58 +02:00
Georgi Gerganov
1d31d737d8 Add link to recent podcast about whisper.cpp and llama.cpp 2023-03-23 09:48:51 +02:00
Gary Linscott
686427a35f Add details on perplexity to README.md (#395) 2023-03-22 08:53:54 -07:00
Georgi Gerganov
1deed1f1e7 Remove temporary notice and update hot topics 2023-03-22 07:34:02 +02:00
Gary Mulder
3081cf8ed9 Add SHA256SUMS file and instructions to README how to obtain and verify the downloads
Hashes created using:

sha256sum models/*B/*.pth models/*[7136]B/ggml-model-f16.bin* models/*[7136]B/ggml-model-q4_0.bin* > SHA256SUMS
2023-03-21 23:19:11 +01:00
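The verification counterpart for users would be `sha256sum --check SHA256SUMS`, which recomputes each file's hash and compares it to the recorded value; the README instructions this commit adds presumably describe the same flow.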
Georgi Gerganov
278d9b3d84 Add notice about pending change 2023-03-21 22:57:35 +02:00
Georgi Gerganov
e7a75316dc Minor style changes 2023-03-21 18:10:32 +02:00
Georgi Gerganov
f57b30a8e1 Add chat.sh script 2023-03-21 18:09:46 +02:00
Georgi Gerganov
614b1afa1c Fix convert script, warnings, alpaca instructions, default params 2023-03-21 17:59:16 +02:00
Kevin Kwok
9c26af616b Update IPFS links to quantized alpaca with new tokenizer format (#352) 2023-03-21 17:34:49 +02:00
Mack Straight
60d93896be sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here right now since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
2023-03-20 03:17:23 -07:00
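To make the "new file magic + version header field" concrete, a hedged C++ sketch of such a header check follows; the constants are illustrative assumptions, not necessarily the exact values this commit chose:

```cpp
#include <cstdint>
#include <cstdio>

constexpr uint32_t MODEL_MAGIC   = 0x67676d66; // 'ggmf' in hex (assumed)
constexpr uint32_t MODEL_VERSION = 1;          // assumed initial version

// Returns true if `f` starts with a recognized magic and supported version.
static bool check_header(FILE * f) {
    uint32_t magic = 0, version = 0;
    if (fread(&magic,   sizeof(magic),   1, f) != 1) return false;
    if (fread(&version, sizeof(version), 1, f) != 1) return false;
    if (magic != MODEL_MAGIC) {
        fprintf(stderr, "bad magic: unversioned files need migration (see readme notice)\n");
        return false;
    }
    if (version > MODEL_VERSION) {
        fprintf(stderr, "unsupported file version %u\n", (unsigned) version);
        return false;
    }
    return true;
}
```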