ik_llama.cpp/llama.cpp at 8ceb2442b716ac1da220a784e4572797df480eb5

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-07 12:30:08 +00:00


Georgi Gerganov e40aa5185e ggml : adjust mul_mat_f16 work memory (#1226)

* llama : minor - remove explicit int64_t cast

* ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS

* ggml : add asserts to guard for incorrect wsize

2023-04-29 18:43:28 +03:00

92 KiB
