diff --git a/README.md b/README.md
index 6415234a..d994df92 100644
--- a/README.md
+++ b/README.md
@@ -48,6 +48,8 @@
 cmake -B build -DGGML_NATIVE=ON -DGGML_CUDA=ON
 cmake --build build --config Release -j$(nproc)
 ```
+### Step-by-step instructions for a successful Windows build
+https://github.com/ikawrakow/ik_llama.cpp/blob/main/docs/build.md
 
 ### Run
diff --git a/docs/build.md b/docs/build.md
index 8b16d1a3..ca7ec83b 100644
--- a/docs/build.md
+++ b/docs/build.md
@@ -1,4 +1,5 @@
 # Build llama.cpp locally
+A typical build targets a CPU + GPU split and requires installing a number of tools beforehand, which can clutter the configuration of your main OS if you are on Windows. To avoid this, consider building inside a virtual machine running Windows 10. In that case, make sure you have a way to copy files from the VM to the host OS, e.g. via RDP. So, Windows users, consider performing the following steps in a VM.
 
 **To get the Code:**
@@ -61,16 +62,111 @@ In order to build llama.cpp you have four different options.
   cmake --build build --config Debug
   ```
 - Building for Windows (x86, x64 and arm64) with MSVC or clang as compilers:
   - Install Visual Studio 2022, e.g. via the [Community Edition](https://visualstudio.microsoft.com/de/vs/community/). In the installer, select at least the following options (this also automatically installs the required additional tools like CMake,...):
     - Tab Workload: Desktop-development with C++
     - Tab Components (select quickly via search): C++-_CMake_ Tools for Windows, _Git_ for Windows, C++-_Clang_ Compiler for Windows, MS-Build Support for LLVM-Toolset (clang)
   - Please remember to always use a Developer Command Prompt / PowerShell for VS2022 for git, build, test
-  - For Windows on ARM (arm64, WoA) build with:
-  ```bash
+  - An example of a complete x64 build with clang-cl and CUDA, run from a `cmd` prompt:
+  ```cmd
+  git.exe clone https://github.com/ggml-org/llama.cpp "C:\Downloads\ik_llama.cpp_git"
+  cd "C:\Downloads\ik_llama.cpp_git"
+ set VS_DIR=c:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools
+ call "%VS_DIR%\VC\Auxiliary\Build\vcvarsall.bat" x64
+ set LLVM_DIR=%VS_DIR%/VC/Tools/Llvm/x64
+ set CUDA_DIR=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6
+ set "PATH=%LLVM_DIR%/bin;%CUDA_DIR%/bin;%PATH%"
+ "c:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" ^
+ -G Ninja ^
+ -S "C:/Downloads/ik_llama.cpp_git" ^
+ -B "C:/Downloads/output_compilations" ^
+ -DCMAKE_C_COMPILER="%LLVM_DIR%/bin/clang-cl.exe" ^
+ -DCMAKE_CXX_COMPILER="%LLVM_DIR%/bin/clang-cl.exe" ^
+ -DCMAKE_CUDA_COMPILER="%CUDA_DIR%/bin/nvcc.exe" ^
+ -DCUDAToolkit_ROOT="%CUDA_DIR%" ^
+ -DCMAKE_CUDA_ARCHITECTURES="89-real" ^
+ -DCMAKE_BUILD_TYPE=Release ^
+ -DGGML_CUDA=ON ^
+ -DLLAMA_CURL=OFF ^
+ -DCMAKE_C_FLAGS="/clang:-march=znver4 /clang:-fvectorize /clang:-ffp-model=fast /clang:-fno-finite-math-only /clang:-Wno-format /clang:-Wno-unused-variable /clang:-Wno-unused-function /clang:-Wno-gnu-zero-variadic-macro-arguments" ^
+ -DCMAKE_CXX_FLAGS="/EHsc /clang:-march=znver4 /clang:-fvectorize /clang:-ffp-model=fast /clang:-fno-finite-math-only /clang:-Wno-format /clang:-Wno-unused-variable /clang:-Wno-unused-function /clang:-Wno-gnu-zero-variadic-macro-arguments" ^
+ -DCMAKE_CUDA_STANDARD=17 ^
+ -DGGML_AVX512=ON ^
+ -DGGML_AVX512_VNNI=ON ^
+ -DGGML_AVX512_VBMI=ON ^
+ -DGGML_CUDA_USE_GRAPHS=ON ^
+ -DGGML_SCHED_MAX_COPIES=1 ^
+ -DGGML_OPENMP=ON
+ "c:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" --build "C:/Downloads/output_compilations" --config Release
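+  REM Optional smoke test: check that the freshly built binary starts. The bin/
+  REM output directory, the llama-cli.exe name, and the --version flag are
+  REM assumptions carried over from upstream llama.cpp defaults, not verified here.
+  "C:/Downloads/output_compilations/bin/llama-cli.exe" --version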
+  ```
+  - For Windows on ARM (arm64, WoA) build with:
+  ```bash
cmake --preset arm64-windows-llvm-release -D GGML_OPENMP=OFF
cmake --build build-arm64-windows-llvm-release
  ```
- Note: Building for arm64 could also be done just with MSVC (with the build-arm64-windows-MSVC preset, or the standard CMake build instructions). But MSVC does not support inline ARM assembly-code, used e.g. for the accelerated Q4_0_4_8 CPU kernels.
+