mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-07 04:20:03 +00:00
Fused delta-net (#1315)
* Revive fused delta-net
* Add command line argument for fused delta net
* Simplify/improve CUDA delta-net
* Add -fdn to llama-bench
* More CUDA fused delta net optimizations
* CPU optimizations
* Much faster fused delta-net on the CPU. It seems it is faster than the chunked implementation!
* Change meaning of fdn from bool flag to threshold value
* Use eps = 1e-6
* Give some nodes a name
This commit is contained in:
@@ -456,6 +456,7 @@ extern "C" {
     bool split_mode_graph_scheduling; // if true, force split mode graph scheduling
     //bool split_mode_f16; // if true, cast intermediate results to f16 before copying to other GPUs
     bool scheduler_async; // if true, with split mode "graph" graph evaluation will be done using multiple threads
+    int fused_delta_net;
     bool mtp; // Activate MTP if supported
     enum llama_mtp_op_type mtp_op_type;