Support for Llama-3-Nemotron models (#377)

* conflict resolution

* Changes to make work and add longrope support

* Changes to n_attention_wv rule

* Untested support of 253B

* DeciLMCausalModel now reads rope_theta from config.json properly

* Remove errant Granite mentions

* Better n_attention_vw rule

* Update vocab.py

---------

Co-authored-by: Yee Man Chan <ymchan@gmail.com>
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
This commit is contained in:
saood06
2025-05-09 02:09:59 -05:00
committed by GitHub
parent 828758ec0d
commit 87bfad8437
6 changed files with 523 additions and 8 deletions

View File

@@ -230,7 +230,8 @@ extern "C" {
LLAMA_ROPE_SCALING_TYPE_NONE = 0,
LLAMA_ROPE_SCALING_TYPE_LINEAR = 1,
LLAMA_ROPE_SCALING_TYPE_YARN = 2,
LLAMA_ROPE_SCALING_TYPE_MAX_VALUE = LLAMA_ROPE_SCALING_TYPE_YARN,
LLAMA_ROPE_SCALING_TYPE_LONGROPE = 3,
LLAMA_ROPE_SCALING_TYPE_MAX_VALUE = LLAMA_ROPE_SCALING_TYPE_LONGROPE,
};
enum llama_pooling_type {