Mirror of https://github.com/kvcache-ai/ktransformers.git (synced 2026-03-14 18:37:23 +00:00)
[refactor]: Change named 'KT-SFT' to 'kt-sft' (#1626)

* Change named 'KT-SFT' to 'kt-sft'
* [docs]: update kt-sft name

Co-authored-by: ZiWei Yuan <yzwliam@126.com>
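A directory rename like this is the kind of change `git mv` records in one step. A minimal sketch in a throwaway repository (the repo path, file, and commit messages below are illustrative, not the actual PR's commands):

```shell
# Illustrative only: rename a tracked directory so Git records it as a
# rename (history then follows with `git log --follow`).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir KT-SFT
echo "sample" > KT-SFT/README.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -qm "init"

# Stages the rename for every file under the directory in one command.
git mv KT-SFT kt-sft
git -c user.name=demo -c user.email=demo@example.com commit -qm "rename KT-SFT to kt-sft"
```

Git detects the rename by content similarity at diff time, which is why the `.gitignore`/`.gitmodules` entries below show as pure renames with 0 changed lines.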
**README.md** (12 lines changed)
````diff
@@ -8,12 +8,12 @@
 </p>
 <h3>A Flexible Framework for Experiencing Cutting-edge LLM Inference/Fine-tune Optimizations</h3>
-<strong><a href="#-overview">🎯 Overview</a> | <a href="#-kt-kernel---high-performance-inference-kernels">🚀 kt-kernel</a> | <a href="#-kt-sft---fine-tuning-framework">🎓 KT-SFT</a> | <a href="#-citation">🔥 Citation</a> | <a href="https://github.com/kvcache-ai/ktransformers/issues/1582">🚀 Roadmap(2025Q4)</a> </strong>
+<strong><a href="#-overview">🎯 Overview</a> | <a href="#-kt-kernel---high-performance-inference-kernels">🚀 kt-kernel</a> | <a href="#-kt-sft---fine-tuning-framework">🎓 kt-sft</a> | <a href="#-citation">🔥 Citation</a> | <a href="https://github.com/kvcache-ai/ktransformers/issues/1582">🚀 Roadmap(2025Q4)</a> </strong>
 </div>

 ## 🎯 Overview

-KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [KT-SFT](./KT-SFT/).
+KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [kt-sft](./kt-sft/).

 ## 🔥 Updates
@@ -79,7 +79,7 @@ pip install .
 ---

-### 🎓 [KT-SFT](./KT-SFT/) - Fine-Tuning Framework
+### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework

 KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
@@ -101,12 +101,12 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning
 **Quick Start:**
 ```bash
-cd KT-SFT
-# Install environment following KT-SFT/README.md
+cd kt-sft
+# Install environment following kt-sft/README.md
 USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
 ```

-👉 **[Full Documentation →](./KT-SFT/README.md)**
+👉 **[Full Documentation →](./kt-sft/README.md)**

 ---
````
||||
**README_ZH.md** (12 lines changed)
````diff
@@ -8,12 +8,12 @@
 </p>
 <h3>一个用于体验尖端 LLM 推理/微调优化的灵活框架</h3>
-<strong><a href="#-概览">🎯 概览</a> | <a href="#-kt-kernel---高性能推理内核">🚀 kt-kernel</a> | <a href="#-kt-sft---微调框架">🎓 KT-SFT</a> | <a href="#-引用">🔥 引用</a> </strong>
+<strong><a href="#-概览">🎯 概览</a> | <a href="#-kt-kernel---高性能推理内核">🚀 kt-kernel</a> | <a href="#-kt-sft---微调框架">🎓 kt-sft</a> | <a href="#-引用">🔥 引用</a> </strong>
 </div>

 ## 🎯 概览

-KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [KT-SFT](./KT-SFT/)。
+KTransformers 是一个专注于通过 CPU-GPU 异构计算实现大语言模型高效推理和微调的研究项目。该项目已发展为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./kt-sft/)。

 ## 🔥 更新
@@ -78,7 +78,7 @@ pip install .
 ---

-### 🎓 [KT-SFT](./KT-SFT/) - 微调框架
+### 🎓 [kt-sft](./kt-sft/) - 微调框架

 KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。
@@ -100,12 +100,12 @@ KTransformers × LLaMA-Factory 集成,用于超大型 MoE 模型微调。
 **快速开始:**
 ```bash
-cd KT-SFT
-# 按照 KT-SFT/README.md 安装环境
+cd kt-sft
+# 按照 kt-sft/README.md 安装环境
 USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
 ```

-👉 **[完整文档 →](./KT-SFT/README.md)**
+👉 **[完整文档 →](./kt-sft/README.md)**

 ---
````
||||
**(file name not captured in this mirror view)**

````diff
@@ -9,7 +9,7 @@
 ## 🎯 Overview

-KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [KT-SFT](./KT-SFT/).
+KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [kt-sft](./kt-sft/).

 ## 🔥 Updates
@@ -67,7 +67,7 @@ pip install .
 ---

-### 🎓 [KT-SFT](./KT-SFT/) - Fine-Tuning Framework
+### 🎓 [kt-sft](./kt-sft/) - Fine-Tuning Framework

 KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning.
@@ -89,12 +89,12 @@ KTransformers × LLaMA-Factory integration for ultra-large MoE model fine-tuning
 **Quick Start:**
 ```bash
-cd KT-SFT
-# Install environment following KT-SFT/README.md
+cd kt-sft
+# Install environment following kt-sft/README.md
 USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
 ```

-👉 **[Full Documentation →](./KT-SFT/README.md)**
+👉 **[Full Documentation →](./kt-sft/README.md)**

 ---
````
||||
**(file name not captured in this mirror view)**

````diff
@@ -9,7 +9,7 @@
 ## 🎯 项目概述

-KTransformers 是一个专注于大语言模型高效推理和微调的研究项目,通过 CPU-GPU 异构计算实现资源受限环境下的模型部署。项目已演进为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [KT-SFT](./KT-SFT/)。
+KTransformers 是一个专注于大语言模型高效推理和微调的研究项目,通过 CPU-GPU 异构计算实现资源受限环境下的模型部署。项目已演进为**两个核心模块**:[kt-kernel](./kt-kernel/) 和 [kt-sft](./kt-sft/)。

 ## 🔥 更新
@@ -66,7 +66,7 @@ pip install .
 ---

-### 🎓 [KT-SFT](./KT-SFT/) - 微调框架
+### 🎓 [kt-sft](./kt-sft/) - 微调框架

 KTransformers × LLaMA-Factory 集成,支持超大 MoE 模型微调。
@@ -86,12 +86,12 @@ KTransformers × LLaMA-Factory 集成,支持超大 MoE 模型微调。
 **快速开始:**
 ```bash
-cd KT-SFT
-# 按照 KT-SFT/README.md 安装环境
+cd kt-sft
+# 按照 kt-sft/README.md 安装环境
 USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
 ```

-👉 **[完整文档 →](./KT-SFT/README.md)**
+👉 **[完整文档 →](./kt-sft/README.md)**

 ---
````
||||
**(file name not captured; a docs table of contents)**

```diff
@@ -3,11 +3,11 @@
 [Introduction](./README.md)
 # Install & Usage
 - [For kt-kernel](en/kt-kernel/kt-kernel_intro.md)
-- [For SFT](en/SFT/KTransformers-Fine-Tuning_User-Guide.md)
+- [For kt-sft](en/SFT/KTransformers-Fine-Tuning_User-Guide.md)

 # Tutorial
-- [SFT part](en/SFT/README.md)
-- [SFT developer tech notes](en/SFT/KTransformers-Fine-Tuning_Developer-Technical-Notes.md)
+- [kt-sft part](en/SFT/README.md)
+- [kt-sft developer tech notes](en/SFT/KTransformers-Fine-Tuning_Developer-Technical-Notes.md)
 - [Injection Tutorial](en/SFT/injection_tutorial.md)
 <!-- - [Multi-GPU Tutorial](en/multi-gpu-tutorial.md) -->
 <!-- - [Use FP8 GPU Kernel](en/fp8_kernel.md) -->
```
||||
**(file name not captured)**

```diff
@@ -1 +1 @@
-# SFT Docs
+# kt-sft Docs
```
||||
Renamed without content changes:

- `KT-SFT/.gitignore` → `kt-sft/.gitignore` (vendored, 0 lines changed)
- `KT-SFT/.gitmodules` → `kt-sft/.gitmodules` (vendored, 0 lines changed)
Renamed image file (unchanged content): 1.1 MiB before and after.
**(file name not captured; a Python debugging script)**

```diff
@@ -35,7 +35,7 @@ gradtype = torch.bfloat16
 # torch.backends.cuda.matmul.allow_tf32 = False

 import shutil
-folder_path = "/home/lpl/KT-SFT/debug"
+folder_path = "/home/lpl/kt-sft/debug"
 if os.path.exists(folder_path):
     shutil.rmtree(folder_path)
 os.makedirs(folder_path)
@@ -650,13 +650,13 @@ def manual_check(experts_ids):
     down_ba_ori = get_tensor(f"cpp_layer0_E_End{experts_idx}_down_ba_ori_", (expert_token_counts[experts_idx], intermediate_size))

-    # with open(f"/home/lpl/KT-SFT/debug/cpp_{experts_idx}_down_ba_ori_view.txt", "w") as f:
+    # with open(f"/home/lpl/kt-sft/debug/cpp_{experts_idx}_down_ba_ori_view.txt", "w") as f:
     #     f.write(str(down_ba_ori))

     down_output_grad = get_tensor(f"cpp_layer0_E_End{experts_idx}_down_output_grad_", (expert_token_counts[experts_idx], hidden_size))

-    # with open(f"/home/lpl/KT-SFT/debug/cpp_{experts_idx}_down_t_ba_ori_view.txt", "w") as f:
+    # with open(f"/home/lpl/kt-sft/debug/cpp_{experts_idx}_down_t_ba_ori_view.txt", "w") as f:
     #     f.write(str(down_output_grad))
@@ -674,10 +674,10 @@ def manual_check(experts_ids):
     py_down_t_ba = torch.load(f"debug/py_layer0_E_End{experts_idx}_down_output_grad_.pt")
     py_down_ba = torch.load(f"debug/py_layer0_E_End{experts_idx}_gate_output_.pt")

-    # with open(f"/home/lpl/KT-SFT/debug/py_{experts_idx}_down_t_ba_ori_view.txt", "w") as f:
+    # with open(f"/home/lpl/kt-sft/debug/py_{experts_idx}_down_t_ba_ori_view.txt", "w") as f:
     #     f.write(str(py_down_t_ba))

-    # with open(f"/home/lpl/KT-SFT/debug/py_{experts_idx}_down_ba_ori_view.txt", "w") as f:
+    # with open(f"/home/lpl/kt-sft/debug/py_{experts_idx}_down_ba_ori_view.txt", "w") as f:
     #     f.write(str(py_down_ba))

     print(f"cpp_{experts_idx}_down_ba_ori_:{down_ba_ori}")
```
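The hardcoded `/home/lpl/…` debug path is exactly why this rename had to touch a debugging script. A hedged sketch of a more portable variant (`reset_debug_dir` is an illustrative name, not from the repo), anchoring the dump directory to a caller-supplied base instead of one user's home directory:

```python
import shutil
import tempfile
from pathlib import Path

def reset_debug_dir(base: Path) -> Path:
    """Remove and recreate a debug/ dump directory under `base`."""
    folder = base / "debug"
    if folder.exists():
        shutil.rmtree(folder)  # drop stale dumps from earlier runs
    folder.mkdir(parents=True)
    return folder

# Usage: anchor to a temp dir (or a repo-relative path) rather than
# a hardcoded absolute path, so renames don't break the script.
debug_dir = reset_debug_dir(Path(tempfile.mkdtemp()))
```

A repo-relative base such as `Path(__file__).parent` would serve the same purpose for a checked-in script.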
Some files were not shown because too many files have changed in this diff.