Commit Graph

42 Commits

Author SHA1 Message Date
Olivier Chafik
b267b997c5 build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew

* server: update refs -> llama-server

gitignore llama-server

* server: simplify nix package

* main: update refs -> llama

fix examples/main ref

* main/server: fix targets

* update more names

* Update build.yml

* rm accidentally checked in bins

* update straggling refs

* Update .gitignore

* Update server-llm.sh

* main: target name -> llama-cli

* Prefix all example bins w/ llama-

* fix main refs

* rename {main->llama}-cmake-pkg binary

* prefix more cmake targets w/ llama-

* add/fix gbnf-validator subfolder to cmake

* sort cmake example subdirs

* rm bin files

* fix llama-lookup-* Makefile rules

* gitignore /llama-*

* rename Dockerfiles

* rename llama|main -> llama-cli; consistent RPM bin prefixes

* fix some missing -cli suffixes

* rename dockerfile w/ llama-cli

* rename(make): llama-baby-llama

* update dockerfile refs

* more llama-cli(.exe)

* fix test-eval-callback

* rename: llama-cli-cmake-pkg(.exe)

* address gbnf-validator unused fread warning (switched to C++ / ifstream)

* add two missing llama- prefixes

* Updating docs for eval-callback binary to use new `llama-` prefix.

* Updating a few lingering doc references for rename of main to llama-cli

* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.

* Updating documentation references for lookup-merge and export-lora

* Updating two small `main` references missed earlier in the finetune docs.

* Update apps.nix

* update grammar/README.md w/ new llama-* names

* update llama-rpc-server bin name + doc

* Revert "update llama-rpc-server bin name + doc"

This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930.

* add hot topic notice to README.md

* Update README.md

* Update README.md

* rename gguf-split & quantize bins refs in **/tests.sh

---------

Co-authored-by: HanClinto <hanclinto@gmail.com>
2024-06-13 00:41:52 +01:00
Georgi Gerganov
8de006f83e ggml : remove OpenCL (#7735)
ggml-ci
2024-06-04 21:23:20 +03:00
Someone Serge
65d6316f65 nix: .#windows: proper cross-compilation set-up
Take all dependencies from the cross stage, rather tha only stdenv
2024-03-28 07:48:27 +00:00
hutli
b1babcfd2f nix: .#widnows: init
initial nix build for windows using zig

mingwW64 build

removes nix zig windows build

removes nix zig windows build

removed unnessesary glibc.static

removed unnessesary import of pkgs in nix

fixed missing trailing newline on non-windows nix builds

overriding stdenv when building for crosscompiling to windows in nix

better variables when crosscompiling windows in nix

cross compile windows on macos

removed trailing whitespace

remove unnessesary overwrite of "CMAKE_SYSTEM_NAME" in nix windows build

nix: keep file extension when copying result files during cross compile for windows

nix: better checking for file extensions when using MinGW

nix: using hostPlatform instead of targetPlatform when cross compiling for Windows

using hostPlatform.extensions.executable to extract executable format
2024-03-28 07:48:27 +00:00
Someone Serge
5becb0a105 nix: ci: dont test cuda and rocm (for now)
Until https://github.com/ggerganov/llama.cpp/issues/6346 is resolved
2024-03-27 19:18:55 +00:00
Tushar
2d596a57fd build(nix): Introduce flake.formatter for nix fmt (#5687)
* build(nix): Introduce flake.formatter for `nix fmt`
* chore: Switch to pkgs.nixfmt-rfc-style
2024-03-01 15:18:26 -08:00
Mathijs de Bruin
a0a954047e nix: now that we can do so, allow MacOS to build Vulkan binaries
Author:    Philip Taron <philip.taron@gmail.com>
Date:      Tue Feb 13 20:28:02 2024 +0000
2024-02-19 14:49:49 -08:00
Martin Schwaighofer
0d266df204 add Vulkan support to Nix flake 2024-02-03 13:13:07 -06:00
Someone Serge
7cf6f6f7e7 flake.nix: add a comment about flakes vs nix 2024-01-22 12:19:30 +00:00
Philip Taron
8caa88c9e3 nix: remove nixConfig from flake.nix (#4984) 2024-01-16 09:56:21 -08:00
Someone Serge
f0542c5698 flake.nix: suggest the binary caches 2023-12-31 13:14:58 -08:00
Someone Serge
d01cd58a5f flake.nix: expose checks 2023-12-31 13:14:58 -08:00
Someone Serge
198b325606 flake.nix: rocm not yet supported on aarch64, so hide the output 2023-12-31 13:14:58 -08:00
Someone Serge
8f71af30db flake.nix: expose full scope in legacyPackages 2023-12-31 13:14:58 -08:00
Philip Taron
97afbf4d29 flake.nix : rewrite (#4605)
* flake.lock: update to hotfix CUDA::cuda_driver

Required to support https://github.com/ggerganov/llama.cpp/pull/4606

* flake.nix: rewrite

1. Split into separate files per output.

2. Added overlays, so that this flake can be integrated into others.
   The names in the overlay are `llama-cpp`, `llama-cpp-opencl`,
   `llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the
   broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs).

3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/)
   rather than `with pkgs;` so that there's dependency injection rather
   than dependency lookup.

4. Add a description and meta information for each package.
   The description includes a bit about what's trying to accelerate each one.

5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge.

6. Format with `serokell/nixfmt` for a consistent style.

7. Update `flake.lock` with the latest goods.

* flake.nix: use finalPackage instead of passing it manually

* nix: unclutter darwin support

* nix: pass most darwin frameworks unconditionally

...for simplicity

* *.nix: nixfmt

nix shell github:piegamesde/nixfmt/rfc101-style --command \
    nixfmt flake.nix .devops/nix/*.nix

* flake.nix: add maintainers

* nix: move meta down to follow Nixpkgs style more closely

* nix: add missing meta attributes

nix: clarify the interpretation of meta.maintainers

nix: clarify the meaning of "broken" and "badPlatforms"

nix: passthru: expose the use* flags for inspection

E.g.:

```
❯ nix eval .#cuda.useCuda
true
```

* flake.nix: avoid re-evaluating nixpkgs too many times

* flake.nix: use flake-parts

* nix: migrate to pname+version

* flake.nix: overlay: expose both the namespace and the default attribute

* ci: add the (Nix) flakestry workflow

* nix: cmakeFlags: explicit OFF bools

* nix: cuda: reduce runtime closure

* nix: fewer rebuilds

* nix: respect config.cudaCapabilities

* nix: add the impure driver's location to the DT_RUNPATHs

* nix: clean sources more thoroughly

...this way outPaths change less frequently,
and so there are fewer rebuilds

* nix: explicit mpi support

* nix: explicit jetson support

* flake.nix: darwin: only expose the default

---------

Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>
2023-12-29 16:42:26 +02:00
Tungsten842
093afcb68f flake.nix: fix for rocm 5.7 (#3853) 2023-10-31 19:24:03 +02:00
Erik Scholz
2546cef587 flake : update flake.lock for newer transformers version + provide extra dev shell (#3797)
* flake : update flake.lock for newer transformers version + provide extra dev shell with torch and transformers (for most convert-xxx.py scripts)
2023-10-28 16:41:07 +02:00
Eve
482d162480 cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273)
* fix LLAMA_NATIVE

* syntax

* alternate implementation

* my eyes must be getting bad...

* set cmake LLAMA_NATIVE=ON by default

* march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc

* revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile

* remove -DLLAMA_MPI=ON

---------

Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>
2023-10-03 19:53:15 +03:00
Erik Scholz
17a015ab3b nix : add cuda, use a symlinked toolkit for cmake (#3202) 2023-09-25 13:48:30 +02:00
kang
eeee397010 flake : Restore default package's buildInputs (#3262) 2023-09-20 15:48:22 +02:00
Evgeny Kurnevsky
45e8e203c0 flake : use pkg-config instead of pkgconfig (#3188)
pkgconfig is an alias, it got removed from nixpkgs:
295a5e1e2b/pkgs/top-level/aliases.nix (L1408)
2023-09-15 11:10:22 +03:00
jneem
144cf127a8 flake : allow $out/include to already exist (#3175) 2023-09-14 21:54:47 +03:00
Asbjørn Olling
0ca44fb618 flake : include llama.h in nix output (#3159) 2023-09-14 20:25:00 +03:00
takov751
0741785341 flake : add train-text-from-scratch to flake.nix (#3042) 2023-09-08 19:06:26 +03:00
Tungsten842
10f684465a flake.nix : add rocm support and cleanup (#2808) 2023-08-26 21:19:44 +03:00
Volodymyr Vitvitskyi
3f42f52f01 flake : build llama.cpp on Intel with nix (#2795)
Problem
-------
`nix build` fails with missing `Accelerate.h`.

Changes
-------
- Fix build of the llama.cpp with nix for Intel: add the same SDK frameworks as
for ARM
- Add `quantize` app to the output of nix flake
- Extend nix devShell with llama-python so we can use convertScript

Testing
-------
Testing the steps with nix:
1. `nix build`
Get the model and then
2. `nix develop` and then `python convert.py models/llama-2-7b.ggmlv3.q4_0.bin`
3. `nix run llama.cpp#quantize -- open_llama_7b/ggml-model-f16.gguf ./models/ggml-model-q4_0.bin 2`
4. `nix run llama.cpp#llama -- -m models/ggml-model-q4_0.bin -p "What is nix?" -n 400 --temp 0.8 -e -t 8`

Co-authored-by: Volodymyr Vitvitskyi <volodymyrvitvitskyi@SamsungPro.local>
2023-08-26 16:25:39 +03:00
Shouzheng Liu
d47e62c9da metal : matrix-matrix multiplication kernel (#2615)
* metal: matrix-matrix multiplication kernel

This commit removes MPS and uses custom matrix-matrix multiplication
kernels for all quantization types. This commit also adds grouped-query
attention to support llama2 70B.

* metal: fix performance degradation from gqa

Integers are slow on the GPU, and 64-bit divides are extremely slow.
In the context of GQA, we introduce a 64-bit divide that cannot be
optimized out by the compiler, which results in a decrease of ~8% in
inference performance. This commit fixes that issue by calculating a
part of the offset with a 32-bit divide. Naturally, this limits the
size of a single matrix to ~4GB. However, this limitation should
suffice for the near future.

* metal: fix bugs for GQA and perplexity test.

I mixed up ne02 and nb02 in previous commit.
2023-08-16 23:07:04 +03:00
wzy
d88eb2f671 flake : support nix build '.#opencl' (#2337) 2023-07-23 14:57:02 +03:00
wzy
716d8e181a flake : remove intel mkl from flake.nix due to missing files (#2277)
NixOS's mkl misses some libraries like mkl-sdl.pc. See #2261
Currently NixOS doesn't have intel C compiler (icx, icpx). See https://discourse.nixos.org/t/packaging-intel-math-kernel-libraries-mkl/975
So remove it from flake.nix

Some minor changes:

- Change pkgs.python310 to pkgs.python3 to keep latest
- Add pkgconfig to devShells.default
- Remove installPhase because we have `cmake --install` from #2256
2023-07-21 13:26:34 +03:00
wzy
c6e455a3ca flake : update flake.nix (#2270)
When `isx86_32 || isx86_64`, it will use mkl, else openblas

According to
https://discourse.nixos.org/t/rpath-of-binary-contains-a-forbidden-reference-to-build/12200/3,
add -DCMAKE_SKIP_BUILD_RPATH=ON

Fix #2261, Nix doesn't provide mkl-sdl.pc.
When we build with -DBUILD_SHARED_LIBS=ON, -DLLAMA_BLAS_VENDOR=Intel10_lp64
replace mkl-sdl.pc by mkl-dynamic-lp64-iomp.pc
2023-07-19 10:01:55 +03:00
Dave Della Costa
f308c3e432 flake : add runHook preInstall/postInstall to installPhase so hooks function (#2224) 2023-07-14 22:13:38 +03:00
Rowan Hart
e12d863012 flake : fix ggml-metal.metal path and run nixfmt (#1974) 2023-06-24 14:07:08 +03:00
Faez Shakil
fd7e289fca exposed modules so that they can be invoked by nix run github:ggerganov/llama.cpp#server etc (#1863) 2023-06-17 14:13:05 +02:00
Andrei
c6755e854f metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782)
* Fix issue with ggml-metal.metal path

* Add ggml-metal.metal as a resource for llama target

* Update flake.nix metal kernel substitution
2023-06-10 17:47:34 +03:00
jacobi petrucciani
4e1aea2999 flake : update to support metal on m1/m2 (#1724) 2023-06-07 07:15:31 +03:00
Pavol Rusnak
3c1c300ec5 nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py (#981) 2023-04-25 23:19:57 +02:00
Pavol Rusnak
147f0b769c py : cleanup dependencies (#962)
after #545 we do not need torch, tqdm and requests in the dependencies
2023-04-14 15:37:11 +02:00
Pavol Rusnak
787a6000c4 flake.nix: add all binaries from bin (#848) 2023-04-13 15:49:05 +02:00
lon
7e9de8684c Add new binaries to flake.nix (#847) 2023-04-08 12:04:23 +02:00
Ben Siraphob
bf0302d463 Fix Nix build 2023-03-23 17:51:26 +01:00
Ben Siraphob
bc283200bd Nix flake: set meta.mainProgram to llama 2023-03-20 22:50:22 +01:00
Niklas Korz
ec4dddf648 Nix flake (#40)
* Nix flake

* Nix: only add Accelerate framework on macOS

* Nix: development shel, direnv and compatibility

* Nix: use python packages supplied by withPackages

* Nix: remove channel compatibility

* Nix: fix ARM neon dotproduct on macOS

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17 23:03:48 +01:00