🔀 #70 - Fused unary(x)*y

Author: ikawrakow
State: Closed
Created: 2024-09-30
Updated: 2024-10-02

Description

This adds a fused unary(x)*y op, which is useful for parallel FFNs. Here unary can be silu, gelu, or relu.
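
To make the idea concrete, here is a minimal CPU-only sketch of what fusing unary(x)*y means compared with applying the unary op and the multiplication as two separate passes. It is not the code from this PR: the names `UnaryOp`, `apply_unary`, `unary_then_mul`, and `fused_unary_mul` are hypothetical, and the tanh approximation for gelu is an assumption made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not the API used in the PR.
enum class UnaryOp { Silu, Gelu, Relu };

inline float apply_unary(UnaryOp op, float x) {
    switch (op) {
        case UnaryOp::Silu:
            return x / (1.0f + std::exp(-x));
        case UnaryOp::Gelu:
            // tanh approximation of gelu (an assumption for this sketch)
            return 0.5f * x * (1.0f + std::tanh(0.7978845608f * (x + 0.044715f * x * x * x)));
        case UnaryOp::Relu:
            return x > 0.0f ? x : 0.0f;
    }
    return x;
}

// Unfused reference: two passes over the data plus an intermediate buffer.
std::vector<float> unary_then_mul(UnaryOp op, const std::vector<float>& x,
                                  const std::vector<float>& y) {
    assert(x.size() == y.size());
    std::vector<float> tmp(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) tmp[i] = apply_unary(op, x[i]);
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = tmp[i] * y[i];
    return out;
}

// Fused version: one pass, no intermediate buffer.
std::vector<float> fused_unary_mul(UnaryOp op, const std::vector<float>& x,
                                   const std::vector<float>& y) {
    assert(x.size() == y.size());
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = apply_unary(op, x[i]) * y[i];
    return out;
}
```

The fused loop saves one write and one read of the intermediate tensor per element. If the surrounding matrix multiplications dominate the run time, that saving would only show up as a small overall gain.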

Implemented for CPU, CUDA and Metal.

The speedup is disappointingly small: 1-3% for prompt processing (PP), depending on platform and model.

Let me think some more about whether I want to merge it.