ik_llama.cpp/github-data/issues/203-Bug_ Compliation Error for Intel(R) Xeon(R) Gold 6326 CPU.md
2025-07-22 18:18:40 +02:00

### 🐛 [#203](https://github.com/ikawrakow/ik_llama.cpp/issues/203) - Bug: Compilation Error for Intel(R) Xeon(R) Gold 6326 CPU
| **Author** | `Flying-Cloud` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2025-02-12 |
| **Updated** | 2025-02-12 |
---
#### Description
### What happened?
Hello! I hit an error when building the ik_llama.cpp project with the command `cmake --build build --config Release`.
The CPU in my system, an Intel(R) Xeon(R) Gold 6326, does not support AVX512BF16 but does support other AVX512 features.
As a result, compiling iqk_mul_mat.cpp fails with errors for BF16 data.
Can you help me fix this error, or give me some suggestions for fixing it? Thanks!
```
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp: In instantiation of {anonymous}::QFBase::Data {anonymous}::QFT<Float, nrc_in>::load1(int, int) const [with Float = ggml_bf16_t; int nrc_in = 1; {anonymous}::QFBase::Data = __vector(16) float]:
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8249:10: required from void {anonymous}::mul_mat_Qx_Qy_MxN(int, const char*, size_t, int, const {anonymous}::DataInfo&) [with Qy = {anonymous}::QFT<ggml_bf16_t, 1>; Qx = {anonymous}::QFT<ggml_bf16_t, 5>; size_t = long unsigned int]
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8362:65: required from void {anonymous}::mul_mat_fX_fY_T(int, const void*, size_t, const {anonymous}::DataInfo&, int) [with int nrc_y = 1; FloatX = ggml_bf16_t; FloatY = ggml_bf16_t; size_t = long unsigned int]
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8643:17: required from void {anonymous}::set_mul_mat_f({anonymous}::MulMat&) [with FloatX = ggml_bf16_t; FloatY = ggml_bf16_t]
ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8685:76: required from here
ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8173:68: error: no matching function for call to {anonymous}::QFT<ggml_bf16_t, 1>::load(const ggml_bf16_t*) const
8173 | IQK_ALWAYS_INLINE Data load1(int iy, int i) const { return load(y[iy] + k_step*i); }
```
### Name and Version
Intel(R) Xeon(R) Gold 6326 CPU, Ubuntu 20.04
### What operating system are you seeing the problem on?
Linux
### Relevant log output
```shell
```
---
#### 💬 Conversation
👤 **Flying-Cloud** commented the **2025-02-12** at **08:13:39**:<br>
I have added an overload of `load` for bf16 as follows, which resolved the compilation issue in iqk_mul_mat.cpp.
I am not quite sure it is functionally correct, but it did fix the compilation error.
```
static inline Data load(const ggml_bf16_t * x) {
    // Load BF16 data into __m256i
    __m256i bf16_data = _mm256_loadu_si256((const __m256i *)x);
    // Convert BF16 to FP32 by shifting left 16 bits
    __m512i bf16_extended = _mm512_slli_epi32(_mm512_cvtepu16_epi32(bf16_data), 16);
    // Cast to __m512 (FP32)
    return _mm512_castsi512_ps(bf16_extended);
}
```
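The shift-by-16 trick in the overload above can be sanity-checked without AVX512 hardware. The following is a minimal scalar sketch, assuming only that a BF16 value is the upper 16 bits of an IEEE-754 binary32 float. The names `bf16_bits`, `bf16_to_fp32`, `fp32_to_bf16_truncate`, and `check_round_trip` are hypothetical and are not part of ik_llama.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ggml_bf16_t: the 16 raw bits of a bfloat16 value.
struct bf16_bits { uint16_t bits; };

// Scalar equivalent of zero-extending to 32 bits and shifting left by 16:
// the BF16 bits become the sign, exponent, and top 7 mantissa bits of an FP32.
static inline float bf16_to_fp32(bf16_bits x) {
    uint32_t u = static_cast<uint32_t>(x.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);  // bit-cast without violating aliasing rules
    return f;
}

// Truncating FP32 -> BF16 conversion (no rounding); used here only to
// produce inputs for the round-trip check.
static inline bf16_bits fp32_to_bf16_truncate(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return bf16_bits{static_cast<uint16_t>(u >> 16)};
}

// Values whose mantissa fits in 7 bits survive the round trip exactly.
static inline bool check_round_trip() {
    const float samples[] = {1.0f, -2.5f, 0.15625f, 256.0f};
    for (float s : samples) {
        if (bf16_to_fp32(fp32_to_bf16_truncate(s)) != s) return false;
    }
    return true;
}
```

Each 32-bit lane of the vector version performs the same zero-extend and shift, so it should agree with this scalar reference element by element.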
---
👤 **ikawrakow** commented the **2025-02-12** at **08:18:53**:<br>
Yes, this is the right fix. I have disabled `BF16` on my CPU and tested that PR #204 works correctly (not a very thorough testing, but token generation and perplexity seem fine).
Thank you for the report! It is always helpful when things get tested on more diverse systems. Let me know if #204 works correctly for you.
---
👤 **Flying-Cloud** commented the **2025-02-12** at **11:28:50**:<br>
> Yes, this is the right fix. I have disabled `BF16` on my CPU and tested that PR [#204](https://github.com/ikawrakow/ik_llama.cpp/pull/204) works correctly (not a very thorough testing, but token generation and perplexity seem fine).
>
> Thank you for the report! It is always helpful when things get tested on more diverse systems. Let me know if [#204](https://github.com/ikawrakow/ik_llama.cpp/pull/204) works correctly for you.
Line 16082 in iqk_mul_mat.cpp should be changed from
```
#ifdef HAVE_FANCY_SIMD
        case GGML_TYPE_BF16: {
            HelperBF16<Dv, k_step> vh(v, stride_v);
            iqk_flash_helper<Dk, Dv, k_step>(kh, vh, nq1, nk1, stride_q, stride_m, stride_qkv, q, mask, scale, softcap, qkv);
        } break;
#endif
```
to
```
#if defined(HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
        case GGML_TYPE_BF16: {
            HelperBF16<D, k_step> vh(v, stride_v);
            iqk_flash_helper<D, k_step>(kh, vh, nq1, nk1, stride_q, stride_m, stride_qkv, q, mask, scale, softcap, qkv);
        } break;
#endif
```
Otherwise, there will still be an error that `HelperBF16` is not defined.
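
The guard pattern can be illustrated in isolation. Below is a minimal sketch, assuming `HAVE_FANCY_SIMD` is defined by the build and `__AVX512BF16__` is predefined by the compiler when the target supports AVX512BF16 (for example with `-march=native` on a capable CPU). `ElemType`, `dispatch`, and `kHasNativeBF16` are hypothetical names for illustration, not the project's actual code.

```cpp
// Hypothetical type tags, standing in for ggml_type values.
enum class ElemType { F16, BF16 };

// True only when both feature macros are defined at compile time.
constexpr bool kHasNativeBF16 =
#if defined(HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
    true;
#else
    false;
#endif

// Hypothetical dispatcher: returns true if a kernel for `t` was compiled in.
// The BF16 case exists only when the helper it needs would also exist, so
// the switch never references an undefined type on CPUs without AVX512BF16.
static inline bool dispatch(ElemType t) {
    switch (t) {
        case ElemType::F16:
            return true;
#if defined(HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
        case ElemType::BF16:
            return true;  // native BF16 path
#endif
        default:
            return false;  // caller falls back to another implementation
    }
}
```

The point of the sketch is that the guard around the `case` uses the same condition as the guard around the helper's definition, which is what the change above restores.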
---
👤 **ikawrakow** commented the **2025-02-12** at **11:49:04**:<br>
Do you want to submit a PR? (I'll close #204 if you do.) Or do you want me to add it to #204?
---
👤 **Flying-Cloud** commented the **2025-02-12** at **11:51:48**:<br>
For convenience, adding it to #204 is fine. There are no other issues once these two code changes are applied. Thanks for your effort!