📝 #263 - Benchmarking DeepSeek R1 - 16x3090

Author davidsyoung
State Closed
Created 2025-03-18
Updated 2025-03-18

Description

I wanted to create a resource for anyone looking to optimise -b, -ub and -amb together with -mla 2 -fa -fmoe when running DeepSeek R1 fully offloaded to CUDA with ik_llama.cpp @ dcdfad29f7.

Layers are not evenly spread over the 16 GPUs, and GPU utilisation averages only 5-10%, at under 150 W per GPU.

I'm not sure how useful this is, but I ran it overnight. It hit an OOM error at -b 4096 pp8192, so the table ends there, but I still feel it's useful!
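For anyone wanting to repeat a sweep like this, the grid can be enumerated with a short script. This is only a sketch: the value lists are read off the results table in this issue, while the binary path, the model path, and the use of -m, -ngl, -p and -n with comma-separated values are assumptions based on the usual llama-bench conventions, so check `llama-bench --help` for your build before relying on them.

```python
from itertools import product

# Values read off the results table in this issue.
N_BATCH = [2048, 4096]
N_UBATCH = [512, 1024, 2048, 4096]
AMB = [1024, 512, 128, 64, 32]
PP_TESTS = [512, 1024, 2048, 4096, 8192]
TG_TESTS = [128, 256, 512, 1024, 2048]


def sweep_commands(bench="./llama-bench", model="model.gguf"):
    """Build one argument list per (-b, -ub, -amb) combination.

    `bench` and `model` are hypothetical paths. The -m/-ngl/-p/-n spellings
    are assumed from common llama-bench usage, not confirmed by this issue.
    """
    pp = ",".join(str(n) for n in PP_TESTS)
    tg = ",".join(str(n) for n in TG_TESTS)
    commands = []
    for b, ub, amb in product(N_BATCH, N_UBATCH, AMB):
        commands.append([
            bench, "-m", model, "-ngl", "63",
            "-b", str(b), "-ub", str(ub), "-amb", str(amb),
            "-mla", "2", "-fa", "1", "-fmoe", "1",
            "-p", pp, "-n", tg,
        ])
    return commands


commands = sweep_commands()
print(len(commands), "configurations,", len(PP_TESTS) + len(TG_TESTS), "tests each")
print(" ".join(commands[0]))
```

The script only prints the commands; running each one separately (for example with `subprocess.run`) would let a sweep continue past a single OOM failure instead of stopping.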

model size params backend ngl n_batch n_ubatch fa mla amb fmoe test t/s
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 pp512 216.01 ± 4.70
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 pp1024 219.99 ± 2.45
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 pp2048 219.74 ± 1.46
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 pp4096 208.57 ± 0.58
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 pp8192 183.37 ± 0.73
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 tg128 17.22 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 tg256 17.84 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 tg512 18.06 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 tg1024 18.02 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 1024 1 tg2048 17.74 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp512 238.55 ± 2.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp1024 235.57 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp2048 226.29 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp4096 208.86 ± 0.10
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp8192 182.56 ± 0.39
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 tg128 17.23 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 tg256 17.87 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 tg512 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 tg1024 18.01 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 tg2048 17.75 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 pp512 239.67 ± 1.22
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 pp1024 235.22 ± 1.85
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 pp2048 225.73 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 pp4096 207.66 ± 0.12
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 pp8192 179.22 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 tg128 17.25 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 tg256 17.85 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 tg512 18.05 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 tg1024 18.04 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 128 1 tg2048 17.77 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 pp512 239.69 ± 0.92
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 pp1024 235.48 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 pp2048 224.92 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 pp4096 205.77 ± 0.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 pp8192 176.72 ± 0.14
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 tg128 17.21 ± 0.08
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 tg256 17.85 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 tg512 18.05 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 tg1024 18.04 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 64 1 tg2048 17.77 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 pp512 236.20 ± 0.76
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 pp1024 233.43 ± 0.95
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 pp2048 222.88 ± 0.17
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 pp4096 203.34 ± 0.16
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 pp8192 173.21 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 tg128 17.27 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 tg256 17.85 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 tg512 18.06 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 tg1024 18.02 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 32 1 tg2048 17.79 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 pp512 238.70 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 pp1024 303.92 ± 1.82
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 pp2048 295.71 ± 0.91
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 pp4096 276.63 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 pp8192 244.18 ± 0.26
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 tg128 17.26 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 tg256 17.79 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 tg512 18.09 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 tg1024 18.04 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 1024 1 tg2048 17.77 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp512 239.64 ± 1.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp1024 305.79 ± 0.40
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp2048 296.58 ± 0.75
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp4096 276.62 ± 0.54
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp8192 244.26 ± 0.31
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 tg128 17.27 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 tg256 17.88 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 tg512 18.09 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 tg1024 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 tg2048 17.70 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 pp512 238.73 ± 1.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 pp1024 304.83 ± 0.61
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 pp2048 295.23 ± 0.09
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 pp4096 275.28 ± 0.29
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 pp8192 239.76 ± 0.39
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 tg128 17.21 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 tg256 17.82 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 tg512 18.05 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 tg1024 18.01 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 128 1 tg2048 17.71 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 pp512 237.98 ± 0.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 pp1024 304.20 ± 0.22
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 pp2048 293.80 ± 1.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 pp4096 272.19 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 pp8192 235.64 ± 0.42
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 tg128 17.14 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 tg256 17.79 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 tg512 18.02 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 tg1024 18.00 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 64 1 tg2048 17.72 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 pp512 238.40 ± 1.47
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 pp1024 301.66 ± 1.64
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 pp2048 290.44 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 pp4096 267.12 ± 0.09
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 pp8192 229.98 ± 0.19
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 tg128 17.16 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 tg256 17.76 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 tg512 18.01 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 tg1024 17.97 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 32 1 tg2048 17.73 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 pp512 240.23 ± 1.70
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 pp1024 305.03 ± 0.60
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 pp2048 349.22 ± 0.37
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 pp4096 327.33 ± 0.82
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 pp8192 290.90 ± 0.26
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 tg128 17.21 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 tg256 17.84 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 tg512 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 tg1024 18.01 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 1024 1 tg2048 17.74 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp512 239.12 ± 3.60
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp1024 305.13 ± 1.86
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp2048 349.84 ± 0.12
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp4096 328.46 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp8192 290.47 ± 0.23
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg128 17.24 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg256 17.81 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg512 18.02 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg1024 18.04 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg2048 17.79 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 pp512 238.52 ± 1.44
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 pp1024 304.77 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 pp2048 348.11 ± 0.69
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 pp4096 326.30 ± 0.69
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 pp8192 288.35 ± 0.12
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 tg128 17.24 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 tg256 17.88 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 tg512 18.07 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 tg1024 18.05 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 128 1 tg2048 17.77 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 pp512 238.42 ± 1.40
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 pp1024 304.32 ± 1.66
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 pp2048 344.70 ± 1.92
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 pp4096 323.64 ± 0.60
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 pp8192 283.02 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 tg128 17.22 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 tg256 17.86 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 tg512 18.06 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 tg1024 18.06 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 64 1 tg2048 17.79 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp512 236.64 ± 1.54
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp1024 301.44 ± 1.56
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp2048 343.13 ± 0.36
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp4096 317.60 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp8192 274.27 ± 0.22
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 tg128 17.28 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 tg256 17.89 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 tg512 18.08 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 tg1024 18.05 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 tg2048 17.78 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 pp512 238.37 ± 1.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 pp1024 304.95 ± 1.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 pp2048 349.14 ± 0.52
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 pp4096 327.89 ± 0.19
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 pp8192 291.05 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 tg128 17.25 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 tg256 17.81 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 tg512 18.06 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 tg1024 18.04 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 1024 1 tg2048 17.78 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 pp512 238.06 ± 0.70
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 pp1024 304.73 ± 0.74
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 pp2048 348.72 ± 1.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 pp4096 328.20 ± 0.51
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 pp8192 290.87 ± 0.49
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 tg128 17.27 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 tg256 17.88 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 tg512 18.09 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 tg1024 18.04 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 512 1 tg2048 17.72 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 pp512 239.80 ± 0.46
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 pp1024 306.38 ± 1.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 pp2048 348.17 ± 0.55
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 pp4096 325.50 ± 0.88
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 pp8192 288.20 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 tg128 17.25 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 tg256 17.83 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 tg512 18.10 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 tg1024 18.06 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 128 1 tg2048 17.76 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 pp512 237.92 ± 2.32
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 pp1024 304.37 ± 0.47
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 pp2048 347.09 ± 0.66
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 pp4096 323.48 ± 0.46
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 pp8192 283.28 ± 0.14
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 tg128 17.20 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 tg256 17.86 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 tg512 18.05 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 tg1024 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 64 1 tg2048 17.78 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 pp512 238.77 ± 2.73
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 pp1024 302.54 ± 0.90
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 pp2048 342.62 ± 0.56
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 pp4096 317.58 ± 0.10
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 pp8192 274.23 ± 0.40
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 tg128 17.27 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 tg256 17.88 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 tg512 18.09 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 tg1024 17.98 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 4096 1 2 32 1 tg2048 17.78 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 pp512 240.30 ± 2.99
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 pp1024 236.20 ± 1.81
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 pp2048 226.46 ± 0.49
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 pp4096 209.52 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 pp8192 183.03 ± 0.23
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 tg128 17.24 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 tg256 17.89 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 tg512 18.08 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 tg1024 18.06 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 1024 1 tg2048 17.77 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 pp512 238.21 ± 0.99
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 pp1024 236.32 ± 1.53
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 pp2048 225.41 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 pp4096 209.14 ± 0.30
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 pp8192 182.42 ± 0.08
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 tg128 17.24 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 tg256 17.86 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 tg512 18.09 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 tg1024 18.06 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 512 1 tg2048 17.78 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 pp512 239.31 ± 0.11
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 pp1024 234.58 ± 0.88
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 pp2048 224.77 ± 0.60
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 pp4096 207.35 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 pp8192 178.79 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 tg128 17.26 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 tg256 17.88 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 tg512 18.07 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 tg1024 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 128 1 tg2048 17.78 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 pp512 239.12 ± 0.21
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 pp1024 235.30 ± 1.41
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 pp2048 224.94 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 pp4096 206.20 ± 0.28
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 pp8192 176.54 ± 0.17
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 tg128 17.29 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 tg256 17.86 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 tg512 18.07 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 tg1024 17.99 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 64 1 tg2048 17.72 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 pp512 238.94 ± 0.70
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 pp1024 233.23 ± 0.45
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 pp2048 222.40 ± 0.23
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 pp4096 203.04 ± 0.51
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 pp8192 173.09 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 tg128 17.25 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 tg256 17.89 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 tg512 18.06 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 tg1024 18.04 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 512 1 2 32 1 tg2048 17.76 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 pp512 239.80 ± 0.48
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 pp1024 305.07 ± 0.33
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 pp2048 295.09 ± 0.13
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 pp4096 275.70 ± 0.25
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 pp8192 243.52 ± 0.27
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 tg128 17.25 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 tg256 17.87 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 tg512 18.03 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 tg1024 17.97 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 1024 1 tg2048 17.72 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 pp512 241.05 ± 0.59
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 pp1024 304.85 ± 1.84
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 pp2048 295.04 ± 0.48
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 pp4096 276.20 ± 0.08
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 pp8192 243.36 ± 0.27
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 tg128 17.17 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 tg256 17.79 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 tg512 18.00 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 tg1024 17.98 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 512 1 tg2048 17.76 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 pp512 238.47 ± 0.34
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 pp1024 305.42 ± 1.32
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 pp2048 295.28 ± 0.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 pp4096 274.18 ± 0.37
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 pp8192 239.55 ± 0.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 tg128 17.27 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 tg256 17.85 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 tg512 17.99 ± 0.06
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 tg1024 18.04 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 128 1 tg2048 17.77 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 pp512 239.49 ± 0.90
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 pp1024 303.09 ± 1.76
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 pp2048 292.21 ± 1.47
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 pp4096 271.27 ± 0.16
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 pp8192 234.84 ± 0.11
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 tg128 17.23 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 tg256 17.83 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 tg512 18.06 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 tg1024 18.05 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 64 1 tg2048 17.73 ± 0.05
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 pp512 238.09 ± 1.33
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 pp1024 302.10 ± 0.35
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 pp2048 289.34 ± 0.51
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 pp4096 266.76 ± 0.16
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 pp8192 229.52 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 tg128 17.29 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 tg256 17.80 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 tg512 18.07 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 tg1024 18.04 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 1024 1 2 32 1 tg2048 17.74 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 pp512 239.40 ± 0.85
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 pp1024 304.81 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 pp2048 348.47 ± 1.08
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 pp4096 327.77 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 pp8192 290.58 ± 0.18
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 tg128 17.26 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 tg256 17.86 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 tg512 18.08 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 tg1024 18.01 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 1024 1 tg2048 17.67 ± 0.11
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 pp512 239.10 ± 1.34
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 pp1024 304.24 ± 2.13
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 pp2048 348.34 ± 0.82
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 pp4096 327.32 ± 0.20
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 pp8192 290.58 ± 0.09
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 tg128 17.27 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 tg256 17.83 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 tg512 18.06 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 tg1024 18.04 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 512 1 tg2048 17.71 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 pp512 239.16 ± 0.38
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 pp1024 304.15 ± 0.87
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 pp2048 347.30 ± 0.52
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 pp4096 325.70 ± 0.67
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 pp8192 287.87 ± 0.21
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 tg128 17.20 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 tg256 17.82 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 tg512 18.04 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 tg1024 18.01 ± 0.00
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 128 1 tg2048 17.72 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 pp512 240.31 ± 3.17
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 pp1024 303.77 ± 1.31
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 pp2048 346.19 ± 0.76
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 pp4096 323.25 ± 0.24
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 pp8192 282.42 ± 0.07
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 tg128 17.18 ± 0.12
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 tg256 17.79 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 tg512 17.99 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 tg1024 18.02 ± 0.02
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 64 1 tg2048 17.78 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 pp512 237.68 ± 1.86
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 pp1024 302.20 ± 1.45
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 pp2048 342.06 ± 0.96
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 pp4096 317.32 ± 0.50
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 pp8192 273.87 ± 0.54
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 tg128 17.28 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 tg256 17.85 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 tg512 18.03 ± 0.03
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 tg1024 18.04 ± 0.04
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 2048 1 2 32 1 tg2048 17.77 ± 0.01
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 4096 1 2 1024 1 pp512 238.93 ± 0.91
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 4096 1 2 1024 1 pp1024 305.36 ± 0.21
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 4096 1 2 1024 1 pp2048 348.42 ± 0.27
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 4096 4096 1 2 1024 1 pp4096 346.42 ± 0.52

Feel free to create whatever graphs you find interesting from it; there's a lot of data, so it's quite hard to isolate individual effects:

PP

[Images: three PP plots]

TG shows no notable difference.
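Because the table is hard to scan by eye, one way to isolate a single effect is to parse the rows and keep, for each n_ubatch, the best result for one test. The following is a minimal sketch, assuming the rows are pasted as whitespace-separated text exactly as they appear in the table above. `parse_row` and `best_by_ubatch` are illustrative helper names, and only five rows are included here.

```python
# A few rows copied verbatim from the table above; paste the full table to analyse everything.
ROWS = """\
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 512 1 2 512 1 pp8192 182.56 ± 0.39
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 1024 1 2 512 1 pp8192 244.26 ± 0.31
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 pp8192 290.47 ± 0.23
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 32 1 pp8192 274.27 ± 0.22
deepseek2 671B Q8_0 307.20 GiB 672.05 B CUDA 63 2048 2048 1 2 512 1 tg128 17.24 ± 0.00
"""


def parse_row(line):
    """Parse one row; fields are indexed from the end because 'size' and 'params' contain spaces."""
    tok = line.split()
    return {
        "n_batch": int(tok[-10]),
        "n_ubatch": int(tok[-9]),
        "amb": int(tok[-6]),
        "test": tok[-4],
        "tps": float(tok[-3]),   # mean t/s; tok[-1] is the ± spread
    }


def best_by_ubatch(rows, test):
    """For one test (e.g. 'pp8192'), keep the highest-throughput row per n_ubatch."""
    best = {}
    for row in rows:
        if row["test"] != test:
            continue
        current = best.get(row["n_ubatch"])
        if current is None or row["tps"] > current["tps"]:
            best[row["n_ubatch"]] = row
    return best


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = best_by_ubatch(rows, "pp8192")
for ub in sorted(best):
    print(f"n_ubatch={ub}: {best[ub]['tps']} t/s (amb={best[ub]['amb']})")
```

Grouping by n_ubatch is just one choice; the same loop keyed on `amb` or `n_batch` would isolate those parameters instead.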


💬 Conversation

👤 davidsyoung commented the 2025-03-18 at 09:37:29:

The model is a mixed quant: Q8_0 for attention, Q5_K down / IQ4_XS up|gate experts for layers 3-8, and IQ4_XS down / IQ3_S up|gate experts for layers 9-60.

| Component | Blocks 0-2 | Blocks 3-8 | Blocks 9-60 |
|---|---|---|---|
| Attention Query/Key/Value | q8_0 | q8_0 | q8_0 |
| Attention Output | q8_0 | q8_0 | q8_0 |
| FFN Down (regular) | q8_0 | - | - |
| FFN Gate/Up (regular) | q8_0 | - | - |
| FFN Down Shared Experts | - | q5_K | q5_K |
| FFN Gate/Up Shared Experts | - | q5_K | q5_K |
| FFN Down Experts | - | q5_K | iq4_xs |
| FFN Gate/Up Experts | - | iq4_xs | iq3_s |
| Output Layer | q8_0 | q8_0 | q8_0 |

Compression Results
Original size: 1,282,038 MB (~1.2 TB)
Quantized size: 314,569 MB (~307 GB)
Compression ratio: 4.1x
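
These figures can be cross-checked with a little arithmetic. The sketch below assumes the "MB" values are really MiB (that reading makes 314,569 line up with the 307.20 GiB shown in the benchmark table) and takes the 672.05 B parameter count from the same table. The bits-per-weight number is a derived estimate, not something reported in this thread.

```python
# Sizes as reported above; treated as MiB (an assumption, see lead-in).
original_mib = 1_282_038
quantized_mib = 314_569

# Parameter count from the llama-bench table ("672.05 B").
n_params = 672.05e9

ratio = original_mib / quantized_mib
quantized_gib = quantized_mib / 1024
# Average storage per weight: total bits in the file / number of parameters.
bits_per_weight = quantized_mib * 1024**2 * 8 / n_params

print(f"compression ratio: {ratio:.2f}x")
print(f"quantized size:    {quantized_gib:.2f} GiB")
print(f"bits per weight:   {bits_per_weight:.2f}")
```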

PPL

perplexity: tokenizing the input ..
perplexity: tokenization took 1195.26 ms
perplexity: calculating perplexity over 561 chunks, n_ctx=512, batch_size=2048, n_seq=4
perplexity: 11.69 seconds per pass - ETA 27.32 minutes
[1]2.5779,[2]3.3447,[3]2.4073,[4]2.0140,[5]1.8352,[6]1.6862,[7]1.5895,[8]1.5208,[9]1.4715,[10]1.4284,[11]1.4147,[12]1.4406,[13]1.4529,[14]1.5824,[15]1.7144,[16]1.7752,[17]1.9408,[18]2.0703,[19]2.0333,[20]2.0250,[21]2.1305,[22]2.1021,[23]2.0764,[24]2.0880,[25]2.0581,[26]2.0330,[27]2.0797,[28]2.0888,[29]2.1391,[30]2.1698,[31]2.2044,[32]2.2227,[33]2.2626,[34]2.3049,[35]2.3566,[36]2.4115,[37]2.4463,[38]2.4930,[39]2.5346,[40]2.5926,[41]2.6353,[42]2.6458,[43]2.6948,[44]2.7107,[45]2.7909,[46]2.8420,[47]2.8003,[48]2.7549,[49]2.7298,[50]2.7498,[51]2.7964,[52]2.8105,[53]2.8597,[54]2.8734,[55]2.9047,[56]2.9384,[57]2.9550,[58]2.9926,[59]3.0027,[60]3.0502,[61]3.0906,[62]3.1475,[63]3.1812,[64]3.2262,[65]3.2360,[66]3.2179,[67]3.1954,[68]3.2271,[69]3.2225,[70]3.2377,[71]3.2562,[72]3.2726,[73]3.2860,[74]3.3095,[75]3.2881,[76]3.2396,[77]3.1959,[78]3.1931,[79]3.1728,[80]3.1563,[81]3.1190,[82]3.1220,[83]3.0918,[84]3.0554,[85]3.0218,[86]2.9995,[87]2.9958,[88]2.9686,[89]2.9537,[90]2.9261,[91]2.8966,[92]2.8704,[93]2.8441,[94]2.8196,[95]2.7964,[96]2.7947,[97]2.8024,[98]2.7882,[99]2.7728,[100]2.7752,[101]2.7671,[102]2.7843,[103]2.8105,[104]2.8288,[105]2.8261,[106]2.8486,[107]2.8737,[108]2.8953,[109]2.9296,[110]2.9637,[111]2.9837,[112]2.9567,[113]2.9436,[114]2.9207,[115]2.9047,[116]2.8905,[117]2.8672,[118]2.8450,[119]2.8235,[120]2.8040,[121]2.7884,[122]2.7698,[123]2.7532,[124]2.7334,[125]2.7156,[126]2.6981,[127]2.6840,[128]2.6757,[129]2.6662,[130]2.6551,[131]2.6472,[132]2.6548,[133]2.6649,[134]2.6714,[135]2.6822,[136]2.6990,[137]2.7145,[138]2.7231,[139]2.7348,[140]2.7353,[141]2.7368,[142]2.7356,[143]2.7359,[144]2.7320,[145]2.7228,[146]2.7211,[147]2.7254,[148]2.7248,[149]2.7265,[150]2.7210,[151]2.7192,[152]2.7157,[153]2.7114,[154]2.7119,[155]2.7159,[156]2.7180,[157]2.7237,[158]2.7322,[159]2.7339,[160]2.7428,[161]2.7509,[162]2.7605,[163]2.7660,[164]2.7863,[165]2.8095,[166]2.8270,[167]2.8399,[168]2.8647,[169]2.8872,[170]2.9083,[171]2.9311,[172]2.9150,[173]2.8980,[174]2.8843,[175]2.8712,[176]2.8
589,[177]2.8467,[178]2.8338,[179]2.8193,[180]2.8228,[181]2.8370,[182]2.8519,[183]2.8669,[184]2.8813,[185]2.8915,[186]2.9083,[187]2.9241,[188]2.9381,[189]2.9489,[190]2.9490,[191]2.9561,[192]2.9601,[193]2.9652,[194]2.9848,[195]2.9935,[196]3.0068,[197]3.0167,[198]3.0211,[199]3.0267,[200]3.0261,[201]3.0415,[202]3.0361,[203]3.0413,[204]3.0446,[205]3.0447,[206]3.0468,[207]3.0552,[208]3.0645,[209]3.0737,[210]3.0738,[211]3.0688,[212]3.0689,[213]3.0765,[214]3.0781,[215]3.0837,[216]3.0847,[217]3.0805,[218]3.0804,[219]3.0811,[220]3.0800,[221]3.0803,[222]3.0803,[223]3.0805,[224]3.0856,[225]3.0871,[226]3.0791,[227]3.0772,[228]3.0792,[229]3.0835,[230]3.0900,[231]3.0962,[232]3.0880,[233]3.0801,[234]3.0803,[235]3.0787,[236]3.0879,[237]3.0957,[238]3.1050,[239]3.1151,[240]3.1241,[241]3.1353,[242]3.1498,[243]3.1632,[244]3.1713,[245]3.1831,[246]3.1937,[247]3.1927,[248]3.1884,[249]3.1867,[250]3.1804,[251]3.1782,[252]3.1805,[253]3.1841,[254]3.1910,[255]3.1971,[256]3.2005,[257]3.2032,[258]3.2042,[259]3.2076,[260]3.2098,[261]3.2107,[262]3.2099,[263]3.2158,[264]3.2179,[265]3.2182,[266]3.2199,[267]3.2230,[268]3.2267,[269]3.2298,[270]3.2290,[271]3.2271,[272]3.2205,[273]3.2208,[274]3.2143,[275]3.2037,[276]3.1934,[277]3.1951,[278]3.2052,[279]3.2115,[280]3.2195,[281]3.2272,[282]3.2333,[283]3.2398,[284]3.2466,[285]3.2603,[286]3.2626,[287]3.2661,[288]3.2707,[289]3.2732,[290]3.2648,[291]3.2557,[292]3.2544,[293]3.2536,[294]3.2513,[295]3.2487,[296]3.2507,[297]3.2513,[298]3.2562,[299]3.2620,[300]3.2651,[301]3.2691,[302]3.2713,[303]3.2734,[304]3.2726,[305]3.2845,[306]3.2922,[307]3.3033,[308]3.2916,[309]3.2865,[310]3.2769,[311]3.2804,[312]3.2825,[313]3.2893,[314]3.2915,[315]3.2946,[316]3.2959,[317]3.2974,[318]3.2979,[319]3.2982,[320]3.3026,[321]3.3028,[322]3.3042,[323]3.3106,[324]3.3112,[325]3.3167,[326]3.3214,[327]3.3255,[328]3.3282,[329]3.3297,[330]3.3360,[331]3.3396,[332]3.3443,[333]3.3428,[334]3.3425,[335]3.3428,[336]3.3429,[337]3.3437,[338]3.3441,[339]3.3466,[340]3.3502,[341]3.3555,[342]3.3649,[343
]3.3744,[344]3.3797,[345]3.3713,[346]3.3640,[347]3.3597,[348]3.3523,[349]3.3488,[350]3.3471,[351]3.3521,[352]3.3671,[353]3.3761,[354]3.3892,[355]3.3977,[356]3.4029,[357]3.4148,[358]3.4246,[359]3.4279,[360]3.4346,[361]3.4439,[362]3.4526,[363]3.4586,[364]3.4649,[365]3.4715,[366]3.4822,[367]3.4909,[368]3.4975,[369]3.5054,[370]3.5138,[371]3.5277,[372]3.5368,[373]3.5401,[374]3.5435,[375]3.5485,[376]3.5616,[377]3.5727,[378]3.5754,[379]3.5749,[380]3.5715,[381]3.5762,[382]3.5816,[383]3.5853,[384]3.5894,[385]3.5931,[386]3.5996,[387]3.6055,[388]3.6087,[389]3.5980,[390]3.5883,[391]3.5774,[392]3.5715,[393]3.5623,[394]3.5535,[395]3.5438,[396]3.5336,[397]3.5245,[398]3.5146,[399]3.5042,[400]3.4963,[401]3.4863,[402]3.4756,[403]3.4668,[404]3.4563,[405]3.4465,[406]3.4364,[407]3.4270,[408]3.4178,[409]3.4090,[410]3.4031,[411]3.4038,[412]3.3993,[413]3.4012,[414]3.4038,[415]3.4009,[416]3.4009,[417]3.4034,[418]3.3979,[419]3.3991,[420]3.3966,[421]3.3953,[422]3.3970,[423]3.3964,[424]3.4006,[425]3.4005,[426]3.4009,[427]3.3997,[428]3.4021,[429]3.4037,[430]3.4064,[431]3.4074,[432]3.4064,[433]3.4027,[434]3.4028,[435]3.3956,[436]3.3891,[437]3.3851,[438]3.3833,[439]3.3805,[440]3.3855,[441]3.3905,[442]3.3979,[443]3.3964,[444]3.3972,[445]3.3983,[446]3.4029,[447]3.4058,[448]3.4083,[449]3.4114,[450]3.4154,[451]3.4184,[452]3.4206,[453]3.4223,[454]3.4208,[455]3.4229,[456]3.4232,[457]3.4257,[458]3.4311,[459]3.4317,[460]3.4318,[461]3.4284,[462]3.4322,[463]3.4396,[464]3.4448,[465]3.4381,[466]3.4361,[467]3.4344,[468]3.4355,[469]3.4328,[470]3.4301,[471]3.4304,[472]3.4311,[473]3.4304,[474]3.4295,[475]3.4308,[476]3.4290,[477]3.4282,[478]3.4288,[479]3.4307,[480]3.4334,[481]3.4290,[482]3.4325,[483]3.4316,[484]3.4353,[485]3.4416,[486]3.4444,[487]3.4479,[488]3.4531,[489]3.4555,[490]3.4603,[491]3.4665,[492]3.4709,[493]3.4707,[494]3.4719,[495]3.4746,[496]3.4764,[497]3.4794,[498]3.4798,[499]3.4790,[500]3.4832,[501]3.4877,[502]3.4865,[503]3.4849,[504]3.4871,[505]3.4905,[506]3.4988,[507]3.5016,[508]3.5050,[509]3.4973,
[510]3.4914,[511]3.4851,[512]3.4810,[513]3.4750,[514]3.4738,[515]3.4761,[516]3.4714,[517]3.4713,[518]3.4704,[519]3.4710,[520]3.4755,[521]3.4744,[522]3.4730,[523]3.4790,[524]3.4775,[525]3.4761,[526]3.4715,[527]3.4663,[528]3.4628,[529]3.4599,[530]3.4568,[531]3.4536,[532]3.4479,[533]3.4415,[534]3.4370,[535]3.4382,[536]3.4410,[537]3.4443,[538]3.4469,[539]3.4496,[540]3.4550,[541]3.4584,[542]3.4607,[543]3.4552,[544]3.4512,[545]3.4508,[546]3.4440,[547]3.4374,[548]3.4307,[549]3.4240,[550]3.4178,[551]3.4116,[552]3.4060,[553]3.4002,[554]3.3983,[555]3.3970,[556]3.3998,[557]3.4039,[558]3.4098,[559]3.4145,[560]3.4197,[561]3.4178,
Final estimate: PPL = 3.4178 +/- 0.01891
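
If you want to plot the running estimate rather than just read the final number, the `[chunk]value` pairs are easy to pull out with a regular expression. A minimal sketch, assuming the log has been saved as text; the only wrinkle is that pasted output can be hard-wrapped in the middle of a number, so newlines are removed before matching. `parse_running_ppl` is a hypothetical helper name, and the sample string is a short excerpt stitched together from the output above, not a complete log.

```python
import re


def parse_running_ppl(log_text):
    """Return [(chunk_index, running_ppl), ...] from perplexity tool output."""
    # Undo hard wraps such as "[176]2.8\n589" or "[343\n]3.3744".
    joined = log_text.replace("\n", "")
    pairs = re.findall(r"\[(\d+)\](\d+\.\d+)", joined)
    return [(int(i), float(v)) for i, v in pairs]


# Excerpt from the log above, including two wrapped entries.
sample = (
    "[175]2.8712,[176]2.8\n"
    "589,[177]2.8467,[342]3.3649,[343\n"
    "]3.3744,[561]3.4178,"
)

points = parse_running_ppl(sample)
last_chunk, last_ppl = points[-1]
print(f"parsed {len(points)} points, last: chunk {last_chunk} -> {last_ppl}")
```

On the full log, the last parsed value should match the number on the `Final estimate` line.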

👤 ikawrakow commented the 2025-03-18 at 09:44:15:

Thank you for this. I think it can be really useful for people.


👤 saood06 commented the 2025-03-18 at 20:14:25:

@ikawrakow Can I convert this to a discussion?


👤 davidsyoung commented the 2025-03-18 at 20:19:37:

All good with me @saood06


👤 ikawrakow commented the 2025-03-18 at 20:29:32:

> @ikawrakow Can I convert this to a discussion?

Sure, go ahead