ComfyUI/comfy
rattus ae79e33345 llama: use a more efficient rope implementation (#12434)
Get rid of the cat and the unary negation, and instead in-place addcmul the two
halves of the rope. Precompute -sin once at the start of the model rather than
in every transformer block.

This is slightly faster on both GPU-bound and CPU-bound setups.
2026-02-12 19:56:42 -05:00
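
For context, below is a minimal sketch (not ComfyUI's actual code) of the pattern the commit describes, assuming the usual rotate_half RoPE convention where the last dimension of x is split into two halves. The names `apply_rope_halves`, `precompute_rope`, and `neg_sin` are hypothetical; the point is that the rotation becomes two addcmul operations on the halves, with -sin computed once up front instead of a cat plus negation in every block.

```python
import torch

def apply_rope_halves(x, cos, sin, neg_sin):
    """RoPE without the cat/negation: two addcmuls on the halves of x.

    x: (..., head_dim); cos, sin, neg_sin broadcastable to (..., head_dim // 2).
    neg_sin (= -sin) is precomputed once, not rebuilt per transformer block.
    """
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    out = torch.empty_like(x)
    # Equivalent to the classic x * cos + rotate_half(x) * sin,
    # where rotate_half(x) = cat((-x2, x1), dim=-1):
    out[..., :d] = torch.addcmul(x1 * cos, x2, neg_sin)  # x1*cos - x2*sin
    out[..., d:] = torch.addcmul(x2 * cos, x1, sin)      # x2*cos + x1*sin
    return out


# Hypothetical helper: build cos/sin/-sin once at the start of the model,
# then reuse the same tensors in every block.
def precompute_rope(head_dim, seq_len, theta=10000.0, device=None, dtype=torch.float32):
    d = head_dim // 2
    inv_freq = 1.0 / (theta ** (torch.arange(0, d, device=device, dtype=dtype) / d))
    angles = torch.arange(seq_len, device=device, dtype=dtype)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()
    return cos, sin, -sin  # -sin computed once here, not in every block
```

Compared with the cat-based formulation, this avoids allocating the concatenated rotated tensor and the extra negation kernel on every call, which is consistent with the slight speedup the commit reports on both GPU- and CPU-bound setups.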