Add GitHub data: filename sanitization (#640)

This commit is contained in:
Thomas
2025-07-23 13:31:53 +02:00
committed by GitHub
parent 7093a35869
commit 0451f10a42
626 changed files with 0 additions and 0 deletions

View File

@@ -0,0 +1,15 @@
### 🔀 [#42](https://github.com/ikawrakow/ik_llama.cpp/pull/42) - Adding fused rms_norm
| **Author** | `ikawrakow` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2024-09-08 |
| **Updated** | 2024-09-08 |
---
#### Description
Many models have one or more of `rms_norm` followed by multiplication with a normalization tensor that is (almost) always just a single row. Fusing these two operations into a single op reduces thread synchronization cost and thus has the potential to improve performance, especially for relatively small models.
This PR adds this fused operation with implementations for the CPU, CUDA and Metal. We get about 1% speedup for PP and TG for Gemma2-2b on all implemented platforms. If we look at a tiny model such as the 99M parameter ternary TriLM, performance improvement is in the range of 5-7%.