7 Commits

Author SHA1 Message Date
turboderp 7f6459f259 MMLU eval: Add redux option 2026-04-18 18:30:25 +02:00
turboderp 5bb4e0d32b MMLU eval: Fix confidence interval 2026-04-04 22:49:51 +02:00
turboderp ef8fd43d1c Cleanup unused imports 2025-11-16 14:25:46 +01:00
turboderp 38ddd8b9c5 MMLU: Fix prompt 2025-11-09 22:25:53 +01:00
turboderp a6d79e5d0d MMLU: Random sample option 2025-07-12 21:14:56 +02:00
turboderp 415a55cc2d MMLU eval: More feedback during eval 2025-07-12 18:31:32 +02:00
turboderp 997ca85bcc Add MMLU eval 2025-07-11 13:55:07 +02:00