🔀 #585 - Special handling of Seed Coder FIM tokens
| Author | fizzAI |
|---|---|
| State | ❌ Closed |
| Created | 2025-07-04 |
| Updated | 2025-07-06 |
Description
Needed this for some quants and realized it wasn't supported yet, so I figured I'd just PR it upstream.
It seems a bit odd to need to figure out model families by vocab size, but I'm not sure of a better way to do it, so I left it as-is for now.
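The vocab-size heuristic described above might look something like this (an illustrative Python sketch, not the actual convert_hf_to_gguf.py code; the 155,136 figure is Seed Coder's vocabulary size as discussed in this PR, and the function name and structure are assumptions):

```python
def guess_model_family(vocab_size: int) -> str:
    """Map a tokenizer's vocabulary size to a model family so that
    family-specific FIM tokens can be selected. Purely illustrative:
    only the 155,136 entry is grounded in this PR's discussion."""
    known_sizes = {
        155_136: "seed-coder",
    }
    return known_sizes.get(vocab_size, "unknown")

print(guess_model_family(155_136))  # seed-coder
print(guess_model_family(32_000))   # unknown
```

The obvious weakness, which the description acknowledges, is that nothing guarantees a vocabulary size is unique to one model family.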
- I have read the contributing guidelines
- Self-reported review complexity:
  - Low
  - Medium
  - High
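For background, the FIM (fill-in-the-middle) tokens in the title are special tokens that let a code model complete text between a given prefix and suffix. A minimal sketch of prefix-suffix-middle (PSM) prompt assembly, using placeholder token strings rather than Seed Coder's actual special tokens:

```python
def build_fim_prompt(prefix: str, suffix: str,
                     fim_pre: str = "<fim_prefix>",
                     fim_suf: str = "<fim_suffix>",
                     fim_mid: str = "<fim_middle>") -> str:
    """Assemble a PSM-style FIM prompt. The three token strings are
    placeholders; each model family defines its own FIM tokens, which
    is why the conversion script needs to detect the family."""
    # The model is expected to generate the 'middle' after fim_mid.
    return f"{fim_pre}{prefix}{fim_suf}{suffix}{fim_mid}"

print(build_fim_prompt("def add(a, b):\n    return ", "\n"))
```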
💬 Conversation
👤 fizzAI commented on 2025-07-04 at 21:23:47:
Actually need to merge some tokenizer support from regular lcpp too, please hold lol
👤 fizzAI commented on 2025-07-04 at 22:43:32:
Appears to work now
👤 ikawrakow submitted a review on 2025-07-05 at 09:29:56: 💬 COMMENTED
👤 ikawrakow commented during a code review on 2025-07-05 at 09:29:56 on convert_hf_to_gguf.py:
Is it the only model that has a vocabulary of 155,136 tokens?
👤 ikawrakow commented during a code review on 2025-07-05 at 09:30:24 on include/llama.h:
Please format the same way as the surrounding code.
👤 ikawrakow commented during a code review on 2025-07-05 at 09:30:33 on src/llama.cpp:
Please format the same way as the surrounding code.
👤 ikawrakow submitted a review on 2025-07-05 at 09:30:54: ✅ APPROVED
👤 fizzAI submitted a review on 2025-07-05 at 19:35:38: 💬 COMMENTED
👤 fizzAI submitted a review on 2025-07-05 at 19:35:56: 💬 COMMENTED
👤 fizzAI commented during a code review on 2025-07-05 at 19:35:56 on include/llama.h:
D: damn my editor