Files
ik_llama.cpp/github-data/pull_requests/585 - Special handling of Seed Coder FIM tokens.md
2025-07-23 13:31:53 +02:00

2.0 KiB

🔀 #585 - Special handling of Seed Coder FIM tokens

Author fizzAI
State Closed
Created 2025-07-04
Updated 2025-07-06

Description

Needed this for some quants and realized it wasn't supported already, so figured I'd just PR it upstream.
It seems a bit odd to need to figure out model families by vocab size, but I'm not sure of a better way to do it, so I left it as-is for now.
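The vocab-size heuristic described above can be sketched roughly as follows. This is a hypothetical illustration, not the merged code: the helper name and the FIM token spellings are assumptions; only the 155,136 vocabulary size comes from the review comments in this thread.

```python
from __future__ import annotations

# Hypothetical sketch of detecting a model family by vocabulary size and
# mapping its fill-in-the-middle (FIM) special tokens. The token strings
# below are placeholders, not necessarily Seed Coder's actual spellings.
SEED_CODER_VOCAB_SIZE = 155136  # figure quoted in the review thread

FIM_ROLE_TO_TOKEN = {
    "prefix": "<[fim-prefix]>",  # assumed token names
    "suffix": "<[fim-suffix]>",
    "middle": "<[fim-middle]>",
}


def detect_fim_tokens(vocab_size: int, token_to_id: dict[str, int]) -> dict[str, int] | None:
    """Return a role -> token-id map when the vocab looks like Seed Coder's.

    Falls back to None when the vocabulary size doesn't match or none of
    the expected FIM tokens are present in the tokenizer.
    """
    if vocab_size != SEED_CODER_VOCAB_SIZE:
        return None
    found = {
        role: token_to_id[tok]
        for role, tok in FIM_ROLE_TO_TOKEN.items()
        if tok in token_to_id
    }
    return found or None
```

As the reviewer's question hints, keying the detection on a single vocabulary size is fragile: any other model with exactly 155,136 tokens would be misclassified, which is why the author flags the approach as odd in the description.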


💬 Conversation

👤 fizzAI commented on 2025-07-04 at 21:23:47:

Actually need to merge some tokenizer support from regular lcpp too, please hold lol


👤 fizzAI commented on 2025-07-04 at 22:43:32:

Appears to work, now


👤 ikawrakow submitted a review on 2025-07-05 at 09:29:56: 💬 COMMENTED


👤 ikawrakow commented during a code review on 2025-07-05 at 09:29:56 on convert_hf_to_gguf.py:

Is it the only model that has a vocabulary of 155,136 tokens?


👤 ikawrakow commented during a code review on 2025-07-05 at 09:30:24 on include/llama.h:

Please format the same way as the surrounding code.


👤 ikawrakow commented during a code review on 2025-07-05 at 09:30:33 on src/llama.cpp:

Please format the same way as the surrounding code.


👤 ikawrakow submitted a review on 2025-07-05 at 09:30:54: APPROVED


👤 fizzAI submitted a review on 2025-07-05 at 19:35:38: 💬 COMMENTED


👤 fizzAI submitted a review on 2025-07-05 at 19:35:56: 💬 COMMENTED


👤 fizzAI commented during a code review on 2025-07-05 at 19:35:56 on include/llama.h:

D: damn my editor