ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-24 15:14:10 +00:00

Author	SHA1	Message	Date
Radosław Gryta	ece0b572f3	ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (#5711 ) * [ggml-quants] Provide ggml_vqtbl1q_u8 for 64bit compatibility vqtbl1q_u8 is not part of arm v7 neon library * [android-example] Remove abi filter after arm v7a fix * [github-workflows] Do not skip Android armeabi-v7a build	2024-02-25 20:43:00 +02:00
Ananta Bastola	27a984488c	ci : add an option to fail on compile warning (#3952 ) * feat(ci): add an option to fail on compile warning * Update CMakeLists.txt * minor : fix compile warnings ggml-ci * ggml : fix unreachable code warnings ggml-ci * ci : disable fatal warnings for windows, ios and tvos * ggml : fix strncpy warning * ci : disable fatal warnings for MPI build * ci : add fatal warnings to ggml-ci ggml-ci --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-17 23:03:14 +02:00
Abhilash Majumder	ee83074b40	Fix f16_sycl cpy call from Arc (#5411 ) * fix f16_sycl cpy call * rm old logic * add fp16 build CI * use macro * format fix	2024-02-08 22:39:10 +05:30
Eve	01d251cd48	Fix broken Vulkan Cmake (properly) (#5230 ) * build vulkan as object * vulkan ci	2024-01-31 20:21:55 +01:00
Neo Zhang Jianyu	d021491248	support SYCL backend windows build (#5208 ) * support SYCL backend windows build * add windows build in CI * add for win build CI * correct install oneMKL * fix install issue * fix ci * fix install cmd * fix install cmd * fix install cmd * fix install cmd * fix install cmd * fix win build * fix win build * fix win build * restore other CI part * restore as base * rm no new line * fix no new line issue, add -j * fix grammer issue * allow to trigger manually, fix format issue * fix format * add newline * fix format * fix format * fix format issuse --------- Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-01-31 08:08:07 +05:30
Jared Van Bortel	dc86054ff4	Nomic Vulkan backend (#4456 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: niansa <anton-sa@web.de> Co-authored-by: Adam Treat <treat.adam@gmail.com> Co-authored-by: Aaron Miller <apage43@ninjawhale.com> Co-authored-by: ToKiNoBug <tokinobug@163.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: slaren <slarengh@gmail.com>	2024-01-29 15:50:50 -05:00
Abhilash Majumder	0efe0f7ed2	ggml : add unified SYCL backend for Intel GPUs (#2690 ) * first update for migration * update init_cublas * add debug functio, commit all help code * step 1 * step 2 * step3 add fp16, slower 31->28 * add GGML_LIST_DEVICE function * step 5 format device and print * step6, enhance error check, remove CUDA macro, enhance device id to fix none-zero id issue * support main device is non-zero * step7 add debug for code path, rm log * step 8, rename all macro & func from cuda by sycl * fix error of select non-zero device, format device list * ren ggml-sycl.hpp -> ggml-sycl.h * clear CMAKE to rm unused lib and options * correct queue: rm dtct:get_queue * add print tensor function to debug * fix error: wrong result in 658746bb26702e50f2c59c0e4ada8e9da6010481 * summary dpct definition in one header file to replace folder:dpct * refactor device log * mv dpct definition from folder dpct to ggml-sycl.h * update readme, refactor build script * fix build with sycl * set nthread=1 when sycl, increase performance * add run script, comment debug code * add ls-sycl-device tool * add ls-sycl-device, rm unused files * rm rear space * dos2unix * Update README_sycl.md * fix return type * remove sycl version from include path * restore rm code to fix hang issue * add syc and link for sycl readme * rm original sycl code before refactor * fix code err * add know issue for pvc hang issue * enable SYCL_F16 support * align pr4766 * check for sycl blas, better performance * cleanup 1 * remove extra endif * add build&run script, clean CMakefile, update guide by review comments * rename macro to intel hardware * editor config format * format fixes * format fixes * editor format fix * Remove unused headers * skip build sycl tool for other code path * replace tab by space * fix blas matmul function * fix mac build * restore hip dependency * fix conflict * ren as review comments * mv internal function to .cpp file * export funciton print_sycl_devices(), mv class dpct definition to source file * update CI/action for sycl code, fix CI error of repeat/dup * fix action ID format issue * rm unused strategy * enable llama_f16 in ci * fix conflict * fix build break on MacOS, due to CI of MacOS depend on external ggml, instead of internal ggml * fix ci cases for unsupported data type * revert unrelated changed in cuda cmake remove useless nommq fix typo of GGML_USE_CLBLAS_SYCL * revert hip cmake changes * fix indent * add prefix in func name * revert no mmq * rm cpu blas duplicate * fix no_new_line * fix src1->type==F16 bug. * pass batch offset for F16 src1 * fix batch error * fix wrong code * revert sycl checking in test-sampling * pass void as arguments of ggml_backend_sycl_print_sycl_devices * remove extra blank line in test-sampling * revert setting n_threads in sycl * implement std::isinf for icpx with fast math. * Update ci/run.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update examples/sycl/run-llama2.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update examples/sycl/run-llama2.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update CMakeLists.txt Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update CMakeLists.txt Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update CMakeLists.txt Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update CMakeLists.txt Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add copyright and MIT license declare * update the cmd example --------- Co-authored-by: jianyuzh <jianyu.zhang@intel.com> Co-authored-by: luoyu-intel <yu.luo@intel.com> Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-01-28 17:56:23 +02:00
crasm	f4cc7db364	ci : add model tests + script wrapper (#4586 ) * scripts : add lib.sh and lib_test.sh * scripts : stub out new ci-run.sh script * scripts : switch to PascalCase for functions This looks a little odd at first, but I find it very useful as a convention to know if a command is part of our code vs a builtin. * scripts : add some fancy conversion from snake_case to PascalCase * Add venv to ci/run.sh * Revert scripts work * scripts : add wrapper script for local use of ci/run.sh * Simplify .gitignore for tests, clang-tidy fixes * Label all ctest tests * ci : ctest uses -L main * Attempt at writing ctest_with_model * Update test-model-load-cancel * ci : add ctest_with_model for debug and release ggml-ci * Fix gg_get_model function ggml-ci * got stuck on CMake * Add get_model.cpp to tests/CMakeLists.txt ggml-ci * Fix README.md output for ctest_with_model ggml-ci * workflows : use `-L main` for all ctest ggml-ci * Fixes * GG_RUN_CTEST_MODELFILE => LLAMACPP_TESTMODELFILE * Always show warning rather than failing if model file variable is not set * scripts : update usage text for ci-run.sh	2024-01-26 14:18:00 +02:00
bobqianic	4aac1d433e	ci : fix Windows CI by updating Intel SDE version (#5053 )	2024-01-22 10:55:05 +02:00
Neuman Vong	0641f5e160	android : introduce starter project example (#4926 ) * Introduce starter project for Android Based on examples/llama.swiftui. * Add github workflow * Set NDK version * Only build arm64-v8a in CI * Sync bench code * Rename CI prop to skip-armeabi-v7a * Remove unused tests	2024-01-16 15:47:34 +02:00
Someone Serge	4be77e38c3	workflows: nix-ci: init; build flake outputs	2023-12-31 13:14:58 -08:00
Georgi Gerganov	8a8220f13a	sync : ggml (new ops, tests, backend, etc.) (#4359 ) * sync : ggml (part 1) * sync : ggml (part 2, CUDA) * sync : ggml (part 3, Metal) * ggml : build fixes ggml-ci * cuda : restore lost changes * cuda : restore lost changes (StableLM rope) * cmake : enable separable compilation for CUDA ggml-ci * ggml-cuda : remove device side dequantize * Revert "cmake : enable separable compilation for CUDA" This reverts commit 09e35d04b1c4ca67f9685690160b35bc885a89ac. * cuda : remove assert for rope * tests : add test-backend-ops * ggml : fix bug in ggml_concat * ggml : restore `ggml_get_n_tasks()` logic in `ggml_graph_plan()` * ci : try to fix macOS * ggml-backend : remove backend self-registration * ci : disable Metal for macOS cmake build ggml-ci * metal : fix "supports family" call * metal : fix assert * metal : print resource path ggml-ci --------- Co-authored-by: slaren <slarengh@gmail.com>	2023-12-07 22:26:54 +02:00
Bailey Chittle	a6a660c556	examples : iOS example with swift ui (#4159 ) * copy to llama.cpp as subdir * attempt enabling metal, fails * ggml metal compiles! * Update README.md * initial conversion to new format, utf8 errors? * bug fixes, but now has an invalid memory access :( * added O3, now has insufficient memory access * begin sync with master * update to match latest code, new errors * fixed it! * fix for loop conditionals, increase result size * fix current workflow errors * attempt a llama.swiftui workflow * Update .github/workflows/build.yml Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-27 16:56:52 +02:00
Eve	67a2970f9f	ci : use intel sde when ci cpu doesn't support avx512 (#3949 )	2023-11-05 09:46:44 +02:00
Zane Shannon	dcdafa74c6	examples : add batched.swift + improve CI for swift (#3562 )	2023-10-11 06:14:05 -05:00
Georgi Gerganov	cd058b6357	ci : enable on obj-c changes + fix metal build (#3540 )	2023-10-08 11:24:50 +03:00
Jhen-Jie Hong	c5d50453ad	ci : fix xcodebuild destinations (#3491 ) * ci : fix xcodebuild destinations * ci : add .swift to paths	2023-10-06 13:36:43 +03:00
Jhen-Jie Hong	9a72ae1535	ci : add swift build via xcodebuild (#3482 )	2023-10-05 16:56:21 +03:00
Eve	482d162480	cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273 ) * fix LLAMA_NATIVE * syntax * alternate implementation * my eyes must be getting bad... * set cmake LLAMA_NATIVE=ON by default * march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc * revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile * remove -DLLAMA_MPI=ON --------- Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>	2023-10-03 19:53:15 +03:00
Eve	eee144f02a	ci : multithreaded builds (#3311 ) * mac and linux threads * windows * Update build.yml * Update build.yml * Update build.yml * automatically get thread count * windows syntax * try to fix freebsd * Update build.yml * Update build.yml * Update build.yml	2023-09-28 22:31:04 +03:00
Georgi Gerganov	9c2ff9b6b8	ci : disable freeBSD builds due to lack of VMs (#3381 )	2023-09-28 19:36:36 +03:00
Alon	b68da9373d	CI: FreeBSD fix (#3258 ) * - freebsd ci: use qemu	2023-09-20 14:06:36 +02:00
Erik Scholz	d997ba652f	ci : switch cudatoolkit install on windows to networked (#3236 )	2023-09-18 02:21:47 +02:00
IsaacDynamo	36be4e8ee8	Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215 )	2023-09-16 19:35:25 +02:00
Cebtenzzre	217da58978	fix build numbers by setting fetch-depth=0 (#3197 )	2023-09-15 15:18:15 -04:00
Alon	d2b63333a9	CI: add FreeBSD & simplify CUDA windows (#3053 ) * add freebsd to ci * bump actions/checkout to v3 * bump cuda 12.1.0 -> 12.2.0 * bump Jimver/cuda-toolkit version * unify and simplify "Copy and pack Cuda runtime" * install only necessary cuda sub packages	2023-09-14 19:21:25 +02:00
Jhen-Jie Hong	e6d674b4cd	cmake : support build for iOS/tvOS (#3116 ) * cmake : support build for iOS/tvOS * ci : add iOS/tvOS build into macOS-latest-cmake * ci : split ios/tvos jobs	2023-09-11 19:49:06 +08:00
Alon	53afc99c41	cov : add Code Coverage and codecov.io integration (#2928 ) * update .gitignore * makefile: add coverage support (lcov, gcovr) * add code-coverage workflow * update code coverage workflow * wun on ubuntu 20.04 * use gcc-8 * check why the job hang * add env vars * add LLAMA_CODE_COVERAGE=1 again * - add CODECOV_TOKEN - add missing make lcov-report * install lcov * update make file -pb flag * remove unused GGML_NITER from workflows * wrap coverage output files in COV_TARGETS	2023-09-03 11:48:49 +03:00
alonfaraj	944a1ab5f1	make : add test and update CI (#2897 ) * build ci: run make test * makefile: - add all - add test * enable tests/test-tokenizer-0-llama * fix path to model * remove gcc-8 from macos build test * Update Makefile * Update Makefile	2023-08-30 12:42:51 +03:00
DannyDaemonic	a74a205f64	Tag release with build number (#2732 ) * Modified build.yml to use build number for release * Add the short hash back into the tag * Prefix the build number with b	2023-08-24 15:58:02 +02:00
Eve	b3331ff75f	ci : add non-AVX scalar build/test (#2356 ) * noavx build and test * we don't need to remove f16c in windows	2023-07-25 15:16:13 +03:00
Evan Miller	7282a23a5e	mpi : add support for distributed inference via MPI (#2099 ) * MPI support, first cut * fix warnings, update README * fixes * wrap includes * PR comments * Update CMakeLists.txt * Add GH workflow, fix test * Add info to README * mpi : trying to move more MPI stuff into ggml-mpi (WIP) (#2099) * mpi : add names for layer inputs + prep ggml_mpi_graph_compute() * mpi : move all MPI logic into ggml-mpi Not tested yet * mpi : various fixes - communication now works but results are wrong * mpi : fix output tensor after MPI compute (still not working) * mpi : fix inference * mpi : minor * Add OpenMPI to GH action * [mpi] continue-on-error: true * mpi : fix after master merge * [mpi] Link MPI C++ libraries to fix OpenMPI * tests : fix new llama_backend API * [mpi] use MPI_INT32_T * mpi : factor out recv / send in functions and reuse * mpi : extend API to allow usage with outer backends (e.g. Metal) --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-10 18:49:56 +03:00
Georgi Gerganov	d94ea1d98a	ci : switch threads to 1 (#2138 )	2023-07-07 21:23:57 +03:00
Qingyou Meng	d259d42719	ggml : change ggml_graph_compute() API to not require context (#1999 ) * ggml_graph_compute: deprecate using ggml_context, try resolve issue #287 * rewrite: no longer consider backward compitability; plan and make_plan * minor: rename ctx as plan; const * remove ggml_graph_compute from tests/test-grad0.c, but current change breaks backward * add static ggml_graph_compute_sugar() * minor: update comments * reusable buffers * ggml : more consistent naming + metal fixes * ggml : fix docs * tests : disable grad / opt + minor naming changes * ggml : add ggml_graph_compute_with_ctx() - backwards compatible API - deduplicates a lot of copy-paste * ci : enable test-grad0 * examples : factor out plan allocation into a helper function * llama : factor out plan stuff into a helper function * ci : fix env * llama : fix duplicate symbols + refactor example benchmark * ggml : remove obsolete assert + refactor n_tasks section * ggml : fix indentation in switch * llama : avoid unnecessary bool * ggml : remove comments from source file and match order in header --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-07 19:24:01 +03:00
Stephan Walter	e0a5b08cdc	ggml : generalize `quantize_fns` for simpler FP16 handling (#1237 ) * Generalize quantize_fns for simpler FP16 handling * Remove call to ggml_cuda_mul_mat_get_wsize * ci : disable FMA for mac os actions --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-05 19:13:06 +03:00
Erik Scholz	bdf2652c83	CI: make the brew update temporarily optional. (#2092 ) until they decide to fix the brew installation in the macos runners. see the open issues. eg https://github.com/actions/runner-images/pull/7710	2023-07-04 01:50:12 +02:00
slaren	1200071552	ci : run when changing only the CUDA sources (#1800 )	2023-06-12 20:12:47 +03:00
Kerfuffle	f989fcc1ac	Include server in releases + other build system cleanups (#1610 ) Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets build. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases. Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default) Fix issue where `vdot` binary wasn't removed when running `make clean`. Fix compile warnings in `server` example. Add `.hpp` files to trigger workflow (the server example has one).	2023-05-27 11:04:14 -06:00
Henri Vasserman	6a7be3e13b	[CI] Fix openblas (#1613 ) * Fix OpenBLAS build * Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.	2023-05-27 17:24:06 +03:00
Henri Vasserman	2097ef1048	[CI] CLBlast: Fix directory name (#1606 )	2023-05-27 14:18:25 +02:00
Henri Vasserman	88c9a3ad7b	Update CLBlast to 1.6.0 (#1580 ) * Update CLBlast to 1.6.0	2023-05-24 10:30:09 +03:00
Zenix	3d683d65fc	feature : support blis and other blas implementation (#1536 ) * feature: add blis support * feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927 * fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake * Fix typo in INTEGER Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Fix: blas changes on ci --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-05-20 17:58:31 +03:00
Henri Vasserman	61a2558d6d	CI: add Windows CLBlast and OpenBLAS builds (#1277 ) * Add OpenCL and CLBlast support * Add OpenBLAS support * Remove testing from matrix * change build name to 'clblast'	2023-05-07 13:20:09 +02:00
Erik Scholz	12b54596b3	ci : add cublas to windows release (#1271 )	2023-05-05 22:56:09 +02:00
Stephan Walter	c62ac69921	Fix build for gcc 8 and test in CI (#1154 )	2023-04-24 15:38:26 +00:00
Stephan Walter	90da0b75a3	ci : trigger CI for drafts, but not most PR actions (#1125 )	2023-04-22 16:12:29 +03:00
Howard Su	f9a61db2de	cmake : fix build under Windows when enable BUILD_SHARED_LIBS (#1100 ) * Fix build under Windows when enable BUILD_SHARED_LIBS * Make AVX512 test on Windows to build the shared libs	2023-04-22 11:18:20 +03:00
Ivan Komarov	59f4d32a01	ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI (#1074 ) [Accelerate](https://developer.apple.com/documentation/accelerate) is an Apple framework which can only be used on macOS, and the CMake build [ignores](https://github.com/ggerganov/llama.cpp/blob/master/CMakeLists.txt#L102) the `LLAMA_ACCELERATE` variable when run on non-Apple platforms. This implies setting `LLAMA_ACCELERATE` is a no-op on Ubuntu and can be removed. This will reduce visual noise in CI check results (in addition to reducing the number of checks we have to run for every PR). Right now every sanitized build is duplicated twice for no good reason (e.g., we have `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, ON)` and `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, OFF)`).	2023-04-20 18:15:18 +03:00
Georgi Gerganov	b2ef9f4eae	ci : do not run on drafts	2023-04-18 19:57:06 +03:00
anzz1	357f21576e	ci : re-enable AVX512 testing (Windows-MSVC) (#584 ) * CI: Re-enable AVX512 testing (Windows-MSVC) Now with 100% less base64 encoding * plain __cpuid is enough here	2023-03-29 23:44:39 +03:00


65 Commits