ik_llama.cpp/github-data/pull_requests/56 - BF16 support on Metal.md
2025-07-23 13:31:53 +02:00


🔀 #56 - BF16 support on Metal

Author ikawrakow
State Closed
Created 2024-09-16
Updated 2024-09-17

Description

It is slightly slower than fp16, but definitely a massive improvement compared to not having bf16 support at all. I didn't put any effort into optimizing the matrix x vector kernel, so bf16 TG performance can likely be improved further.

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 8B BF16 | 14.96 GiB | 8.03 B | Metal | 100 | pp512 | 538.84 ± 0.26 |
| llama 8B F16 | 14.96 GiB | 8.03 B | Metal | 100 | pp512 | 587.26 ± 0.39 |
| llama 8B BF16 | 14.96 GiB | 8.03 B | Metal | 100 | tg128 | 21.64 ± 0.05 |
| llama 8B F16 | 14.96 GiB | 8.03 B | Metal | 100 | tg128 | 21.77 ± 0.03 |