ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-05-11 08:30:19 +00:00

Files

Lennart Lopin fd37776584 Add ARM Grace Blackwell (NVIDIA DGX Spark) support (#922)

This commit enables IQK quantization operations on ARM-based systems,
specifically tested on NVIDIA DGX Spark with GB10 Grace Blackwell.

Changes:
- Enable IQK_IMPLEMENT macro for ARM NEON operations
- Add arm_neon.h header include for ARM SIMD intrinsics
- Fix compilation errors related to missing NEON types and functions
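
The pattern these changes describe (a compile-time gate plus the NEON header) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the commit: `SKETCH_HAVE_NEON` and `sum4` are hypothetical names, and the scalar fallback is there only so the snippet also compiles on non-ARM hosts.

```cpp
// Hypothetical gate macro (NOT the project's IQK_IMPLEMENT): defined only
// when the compiler targets AArch64 with NEON available.
#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>   // declares float32x4_t, vld1q_f32, vaddvq_f32, ...
#define SKETCH_HAVE_NEON 1
#endif

// Sum four consecutive floats, using NEON when the gate is enabled.
static float sum4(const float* x) {
#ifdef SKETCH_HAVE_NEON
    float32x4_t v = vld1q_f32(x);   // load 4 floats into one 128-bit register
    return vaddvq_f32(v);           // horizontal add across the 4 lanes
#else
    return x[0] + x[1] + x[2] + x[3];   // portable scalar fallback
#endif
}
```

Without the `arm_neon.h` include, the NEON branch would fail with errors of the kind listed under "Fixes build errors" below, since `float32x4_t` and `vld1q_f32` are declared by that header.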

Build requirements for ARM:
  cmake .. -DGGML_CUDA=ON \
           -DCMAKE_CXX_FLAGS="-march=armv8.2-a+dotprod+fp16" \
           -DCMAKE_C_FLAGS="-march=armv8.2-a+dotprod+fp16"
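
To check that the `-march` flags above actually reached the compiler, one option is to inspect the ACLE feature macros it defines. The sketch below assumes a GCC- or Clang-style toolchain that sets `__ARM_FEATURE_DOTPROD` and `__ARM_FEATURE_FP16_VECTOR_ARITHMETIC`; `arm_feature_summary` is a hypothetical helper, not part of ik_llama.cpp.

```cpp
#include <string>

// Hypothetical helper: reports which ARM feature macros the compiler defined
// for this translation unit. Purely compile-time; no runtime CPU detection.
static std::string arm_feature_summary() {
#if !defined(__aarch64__)
    return "not an aarch64 build";
#else
    std::string s = "aarch64:";
#if defined(__ARM_FEATURE_DOTPROD)
    s += " dotprod";            // enabled by +dotprod
#endif
#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
    s += " fp16";               // enabled by +fp16
#endif
    return s;
#endif
}
```

Printing the returned string from a small test program built with the same flags shows whether both extensions were enabled for that build.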

Tested on:
- Platform: NVIDIA DGX Spark (aarch64)
- CPU: GB10 Grace Blackwell Superchip
- Memory: 128GB unified memory

Fixes build errors:
- 'float32x4_t' does not name a type
- 'vld1q_f32' was not declared in this scope
- 'v_expf' was not declared in this scope
- Missing FP16 NEON intrinsics

2025-11-09 14:22:40 +02:00

cmake

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

include

CUDA: set compute parameters via command line arguments (#910)

2025-11-07 07:11:23 +02:00

src

Add ARM Grace Blackwell (NVIDIA DGX Spark) support (#922)

2025-11-09 14:22:40 +02:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

Disable CUDA fusion by default for now (#903)

2025-11-05 10:58:12 +02:00