ikawrakow/ik_llama.cpp

🐛 #203 - Bug: Compilation Error for Intel(R) Xeon(R) Gold 6326 CPU

Author	`Flying-Cloud`
State	❌ Closed
Created	2025-02-12
Updated	2025-02-12

Description

What happened?

Hello! I ran into an error when building the ik_llama.cpp project. Running the command 'cmake --build build --config Release' fails because the CPU in my system, an Intel(R) Xeon(R) Gold 6326, does not support AVX512BF16 but does support other AVX512 features. As a result, compiling iqk_mul_mat.cpp produces errors for BF16 data. Can you help me fix this error, or give me some suggestions on how to fix it? Thanks!

llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp: In instantiation of ‘{anonymous}::QFBase::Data {anonymous}::QFT<Float, nrc_in>::load1(int, int) const [with Float = ggml_bf16_t; int nrc_in = 1; {anonymous}::QFBase::Data = __vector(16) float]’:
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8249:10:   required from ‘void {anonymous}::mul_mat_Qx_Qy_MxN(int, const char*, size_t, int, const {anonymous}::DataInfo&) [with Qy = {anonymous}::QFT<ggml_bf16_t, 1>; Qx = {anonymous}::QFT<ggml_bf16_t, 5>; size_t = long unsigned int]’
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8362:65:   required from ‘void {anonymous}::mul_mat_fX_fY_T(int, const void*, size_t, const {anonymous}::DataInfo&, int) [with int nrc_y = 1; FloatX = ggml_bf16_t; FloatY = ggml_bf16_t; size_t = long unsigned int]’
llm/ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8643:17:   required from ‘void {anonymous}::set_mul_mat_f({anonymous}::MulMat&) [with FloatX = ggml_bf16_t; FloatY = ggml_bf16_t]’
ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8685:76:   required from here
ik_llama.cpp/ggml/src/iqk/iqk_mul_mat.cpp:8173:68: error: no matching function for call to ‘{anonymous}::QFT<ggml_bf16_t, 1>::load(const ggml_bf16_t*) const’
 8173 |     IQK_ALWAYS_INLINE Data load1(int iy, int i) const { return load(y[iy] + k_step*i); }
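To confirm which AVX512 features a machine actually reports, the CPUID bits can be queried directly. The following is a minimal sketch, assuming GCC or Clang on x86 (it relies on the `__get_cpuid_count` helper from `<cpuid.h>`); the names `Avx512Support` and `query_avx512_support` are hypothetical and not part of ik_llama.cpp. It only reads the CPU feature bits and does not check whether the operating system has enabled AVX512 state.

```cpp
#include <cassert>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang wrapper around the CPUID instruction
#endif

// Hypothetical result type, used only for this illustration.
struct Avx512Support {
    bool avx512f    = false;  // CPUID.(EAX=7,ECX=0):EBX bit 16
    bool avx512bf16 = false;  // CPUID.(EAX=7,ECX=1):EAX bit 5
};

// Reads the CPU feature bits; on non-x86 targets both flags stay false.
inline Avx512Support query_avx512_support() {
    Avx512Support s;
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        s.avx512f = ((ebx >> 16) & 1u) != 0;
        const unsigned max_subleaf = eax;  // subleaf 0 reports the highest valid subleaf in EAX
        if (max_subleaf >= 1 &&
            __get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
            s.avx512bf16 = ((eax >> 5) & 1u) != 0;
        }
    }
#endif
    return s;
}
```

On a CPU like the one described in this report, one would expect `avx512f` to be true and `avx512bf16` to be false. On Linux the same information should also be visible as the `avx512f` and `avx512_bf16` flags in `/proc/cpuinfo`.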

Name and Version

Intel(R) Xeon(R) Gold 6326 CPU Ubuntu 20.04

What operating system are you seeing the problem on?

Linux

Relevant log output

💬 Conversation

👤 Flying-Cloud commented the 2025-02-12 at 08:13:39:

I have added an overload of the load function for bf16 as follows, which resolved the compilation issue in iqk_mul_mat.cpp. I am not quite sure whether it is functionally correct, but it did fix the compilation bug.

static inline Data load(const ggml_bf16_t * x) {
    // Load BF16 data into __m256i
    __m256i bf16_data = _mm256_loadu_si256((const __m256i *)x);
    // Convert BF16 to FP32 by shifting left 16 bits
    __m512i bf16_extended = _mm512_slli_epi32(_mm512_cvtepu16_epi32(bf16_data), 16);
    // Cast to __m512 (FP32)
    return _mm512_castsi512_ps(bf16_extended);
}
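The reason the shift works is that a BF16 value is the upper 16 bits of an IEEE-754 FP32 value, so zero-extending to 32 bits and shifting left by 16 reproduces the float exactly. Below is a scalar sketch of the same conversion that can be checked without AVX512 hardware. The type `bf16_bits` is a hypothetical stand-in for `ggml_bf16_t`, and the helper names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ggml_bf16_t: the upper 16 bits of an FP32 value.
struct bf16_bits { uint16_t bits; };

// Widen one BF16 value to FP32: zero-extend to 32 bits, then shift left by 16.
// Scalar analogue of _mm512_cvtepu16_epi32 followed by _mm512_slli_epi32.
inline float bf16_to_fp32(bf16_bits h) {
    const uint32_t u = static_cast<uint32_t>(h.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);  // bit cast without aliasing problems
    return f;
}

// Truncating FP32 -> BF16 (no rounding); used here only to build test inputs.
inline bf16_bits fp32_to_bf16_truncate(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return bf16_bits{static_cast<uint16_t>(u >> 16)};
}

// Convert n values; the vector version handles 16 of these per load.
inline void bf16_to_fp32_row(const bf16_bits* x, float* y, int n) {
    assert(x != nullptr && y != nullptr);
    for (int i = 0; i < n; ++i) y[i] = bf16_to_fp32(x[i]);
}
```

For example, the BF16 bit pattern `0x3F80` widens to `0x3F800000`, which is `1.0f`, and `0xC000` widens to `-2.0f`.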

👤 ikawrakow commented the 2025-02-12 at 08:18:53:

Yes, this is the right fix. I have disabled BF16 on my CPU and tested that PR #204 works correctly (not a very thorough testing, but token generation and perplexity seem fine).

Thank you for the report! It is always helpful when things get tested on more diverse systems. Let me know if #204 works correctly for you.

👤 Flying-Cloud commented the 2025-02-12 at 11:28:50:

> Yes, this is the right fix. I have disabled BF16 on my CPU and tested that PR #204 works correctly (not a very thorough testing, but token generation and perplexity seem fine).
>
> Thank you for the report! It is always helpful when things get tested on more diverse systems. Let me know if #204 works correctly for you.

Line 16082 in iqk_mul_mat.cpp should be changed from

#ifdef HAVE_FANCY_SIMD
        case GGML_TYPE_BF16: {
            HelperBF16<Dv, k_step> vh(v, stride_v);
            iqk_flash_helper<Dk, Dv, k_step>(kh, vh, nq1, nk1, stride_q, stride_m, stride_qkv, q, mask, scale, softcap, qkv);
        } break;
#endif

to

#if defined(HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
        case GGML_TYPE_BF16: {
            HelperBF16<D, k_step> vh(v, stride_v);
            iqk_flash_helper<D, k_step>(kh, vh, nq1, nk1, stride_q, stride_m, stride_qkv, q, mask, scale, softcap, qkv);
        } break;
#endif

Otherwise, there will still be an error that HelperBF16 is not defined.
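The pattern behind this change is that a `case` label referring to a BF16-only helper must be guarded by the same preprocessor condition as the helper itself. The sketch below shows the idea in isolation. `SKETCH_HAVE_FANCY_SIMD`, `SketchType`, `pick_kernel`, and `bf16_kernel_compiled` are hypothetical names, and deriving the macro from `__AVX512F__` alone is an assumption that may not match how the project defines `HAVE_FANCY_SIMD`.

```cpp
#include <cassert>
#include <string>

// Assumption: simplified stand-in for the project's HAVE_FANCY_SIMD macro.
#if defined(__AVX512F__)
#define SKETCH_HAVE_FANCY_SIMD 1
#endif

enum class SketchType { F16, BF16 };

// Returns the name of the kernel that would be selected for a given type.
inline std::string pick_kernel(SketchType t) {
    switch (t) {
        case SketchType::F16:
            return "f16";
#if defined(SKETCH_HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
        // Compiled only when the compiler targets AVX512BF16, so code that
        // depends on a BF16-only helper never reaches a CPU without it.
        case SketchType::BF16:
            return "bf16-native";
#endif
        default:
            return "unsupported";
    }
}

// Reports whether the BF16 branch above was compiled in.
inline bool bf16_kernel_compiled() {
#if defined(SKETCH_HAVE_FANCY_SIMD) && defined(__AVX512BF16__)
    return true;
#else
    return false;
#endif
}
```

When built without `__AVX512BF16__` defined (for example with `-march=native` on a CPU that lacks the extension), `pick_kernel(SketchType::BF16)` falls through to `"unsupported"` instead of failing to compile.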

👤 ikawrakow commented the 2025-02-12 at 11:49:04:

Do you want to submit a PR? (I'll close #204 if you do.) Or do you want me to add it to #204?

👤 Flying-Cloud commented the 2025-02-12 at 11:51:48:

For convenience, adding it to #204 is fine. There are no other issues once these two code changes are added. Thanks for your effort.
