Commit Graph

769 Commits

Author SHA1 Message Date
Creeper-MZ
62c4023160 Fixed #1155 2025-04-17 10:21:51 -04:00
Yuhao Tsui
eff5bbc202 Merge branch 'kvcache-ai:main' into main 2025-04-17 22:01:31 +08:00
Creeper-MZ
4fb19bfcae Update chat.py 2025-04-17 09:19:14 -04:00
ZiWei Yuan
8770b6d573 Merge pull request #1159 from onepick/fix-rocm-build-error
Fix some build error for ROCM
2025-04-17 19:57:44 +08:00
onepick
6a7624fe4a Change the logic to build device since CUDA is the default
Signed-off-by: onepick <jiajuku12@163.com>
2025-04-17 19:44:05 +08:00
Yuhao Tsui
8ce34b3b5c Modify the performance calculation module
Modify the performance data calculation module from estimation to retrieving from `raw_usage`.
2025-04-17 16:57:53 +08:00
wang jiahao
6e4da83d4b Merge pull request #978 from cyhasuka/main
Feat: Support Non-streaming chat in Ollama backend
2025-04-17 14:34:35 +08:00
wang jiahao
b055132369 Merge pull request #1154 from 344303947/features/add-function-calling
Fix the error caused by the client not passing temperature and top_p, leaving them empty
2025-04-17 14:31:02 +08:00
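
The fix merged above concerns requests where the client omits temperature and top_p. A minimal sketch of that kind of fallback follows. The function name and both default values are assumptions made for illustration and are not taken from the repository.

```python
from typing import Optional, Tuple

# Hypothetical fallback values, used here for illustration only;
# the project's real defaults live in its own configuration.
DEFAULT_TEMPERATURE = 0.6
DEFAULT_TOP_P = 1.0


def resolve_sampling_params(
    temperature: Optional[float], top_p: Optional[float]
) -> Tuple[float, float]:
    """Replace sampling parameters the client left out (None) with fallbacks."""
    # Compare against None explicitly so that a legitimate 0.0 is preserved.
    if temperature is None:
        temperature = DEFAULT_TEMPERATURE
    if top_p is None:
        top_p = DEFAULT_TOP_P
    return temperature, top_p
```

Checking `is None` rather than truthiness matters here: a client that sends a temperature of 0.0 should get exactly that value back.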
onepick
97f1995696 Fix some build error for ROCM
1. Fix terrible logic in CMakeLists.txt
2. using the correct typedef for hip

Signed-off-by: onepick <jiajuku12@163.com>
2025-04-17 11:34:33 +08:00
Creeper-MZ
cb266c98d4 Fix a bug 2025-04-16 23:31:33 -04:00
wang jiahao
3efb66213b Merge pull request #1157 from jiangshibiao/dev-fix-bug
Add bsz_tensors param to torch linear
2025-04-17 10:11:01 +08:00
Creeper-MZ
6bc2e85343 Update chat.py 2025-04-16 15:54:23 -04:00
Creeper-MZ
88f688e2c8 Change the token injection logic to reduce the number of injected tokens and prevent forgetting
Update chat.py

Update chat.py

Update chat.py
2025-04-16 15:52:24 -04:00
root
921061666c fix some bugs 2025-04-17 00:48:09 +08:00
kevin
c8db24d5eb Update config.py
Update config.py
2025-04-16 17:32:08 +08:00
kevin
badf7a1bb1 Merge branch 'kvcache-ai:main' into features/add-function-calling 2025-04-16 17:21:27 +08:00
Chengyu Qiu
d2cf81423f Merge pull request #1135 from Creeper-MZ/function_call
Feat: Add Function call support
2025-04-16 09:57:22 +08:00
ZiWei Yuan
fcbd41e175 Merge pull request #1143 from jizhilong/improve-cmake-subprocess-output
feat(build): display limited tail of subprocesses in real time
2025-04-15 17:37:44 +08:00
jizhilong
0638ea298d feat(build): display limited tail of subprocesses in real time
this is a followup on #1108
2025-04-15 16:40:38 +08:00
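
One way to read "display limited tail of subprocesses in real time" is sketched below: echo each output line as it arrives and retain only a bounded tail for later error reporting. This is an illustrative sketch, not the code from the commit, and `run_with_tail` and its parameters are hypothetical names.

```python
import subprocess
from collections import deque
from typing import Deque, List, Tuple


def run_with_tail(cmd: List[str], tail_lines: int = 5) -> Tuple[int, List[str]]:
    """Run cmd, echo its output line by line, and keep the last tail_lines lines."""
    tail: Deque[str] = deque(maxlen=tail_lines)  # older lines are dropped automatically
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so errors appear in order
        text=True,
        bufsize=1,  # line-buffered reads on our side of the pipe
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        line = line.rstrip("\n")
        print(line, flush=True)  # show progress as it happens
        tail.append(line)
    returncode = proc.wait()
    return returncode, list(tail)
```

A caller could print the returned tail again when the return code is non-zero, so the last few lines of a failed build stay visible.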
ZiWei Yuan
8dc1ab9e04 Merge pull request #1108 from jizhilong/expose-cmake-logs
chore: show cmake output in real time during build_ext
2025-04-14 17:07:00 +08:00
sean.su
8699109129 Refactor the chat interface to support tool calling and parameter processing
Defined new data structures in chat.py to replace OpenAI's original implementation, adding support for tool calling.

Implemented logic for extracting and processing tool calls, enabling dynamic function invocation during conversations.

Added methods in balance_serve.py to retrieve sampling parameters, handling default values and edge cases.

Updated ktransformers.py and transformers.py to support the passing of tool parameters.

Modified the default value of top_p in config.py to 1.0 to increase generation diversity.

Extended the message model in chat.py to support the transmission of tool call information.

These changes enhance the system's flexibility and functionality, enabling more complex interaction patterns.
2025-04-14 15:23:37 +08:00
Creeper-MZ
a7e8d7c1af update function_call 2025-04-13 23:48:51 -04:00
wang jiahao
038db30ec9 Merge pull request #1132 from wangkuigang-yewu-cmss/long-prompt-crash
Prevent the rpc process from crashing when a long prompt is used
2025-04-13 22:06:11 +08:00
wangkuigang-yewu-cmss
4538bdae97 prevent rpc process from crashing on long prompt
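When the prompt exceeds cache_len, the rpc process crashes and the whole service becomes unavailable.
This adds a check so that overly long prompts are filtered out early in the request.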
2025-04-13 16:13:16 +08:00
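
The check described in the commit above can be sketched roughly as follows, assuming the prompt is already tokenized and that `cache_len` is the number of tokens the cache can hold. The exception class and function name are hypothetical and not taken from the repository.

```python
from typing import Sequence


class PromptTooLongError(ValueError):
    """Raised when a prompt cannot fit in the cache (hypothetical name)."""


def check_prompt_length(prompt_tokens: Sequence[int], cache_len: int) -> None:
    """Reject a prompt early if it has more tokens than cache_len allows."""
    n = len(prompt_tokens)
    if n > cache_len:
        # Fail at the front of the request instead of letting a worker crash later.
        raise PromptTooLongError(
            f"prompt has {n} tokens, but cache_len is {cache_len}"
        )
```

Raising a dedicated exception at request time lets the server return a client error for that one request while the backend process keeps running.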
ErvinXie
797dac7e31 Merge pull request #1109 from aubreyli/libxxhash-fPIC
xxHash: fix link error due to non-position-independent code
2025-04-13 14:15:31 +08:00
ZiWei Yuan
77956822ce Merge pull request #1116 from ikawrakow/ik/add_copyright
Add missing references to ik_llama.cpp
2025-04-13 11:53:12 +08:00
Iwan Kawrakow
99a247e167 Spelling 2025-04-11 10:15:42 +03:00
Iwan Kawrakow
c46b0c59d0 Add missing references to ik_llama.cpp 2025-04-11 09:39:57 +03:00
Aubrey Li
63ca2fa84d xxHash: fix link error due to non-position-independent code
Add PROPERTIES POSITION_INDEPENDENT_CODE option to fix the
following error:

/usr/bin/ld: ../../third_party/xxHash/libxxhash.a(xxhash.c.o):
relocation R_X86_64_32S against `.rodata' can not be used when
making a shared object; recompile with -fPIC

Trying to link a non-PIC static library libxxhash.a into a
.so shared library, which is not allowed. The object file
xxhash.c.o must be recompiled with explicit -fPIC support.
2025-04-10 21:50:23 +08:00
jizhilong
690d4d42f9 chore: show cmake output in real time during build_ext
otherwise cmake error messages may be suppressed, making debugging
difficult
2025-04-10 21:33:04 +08:00
Atream
35ba63e259 Merge pull request #1103 from kvcache-ai/Atream-patch-6
Create SECURITY.md
2025-04-09 19:50:57 +08:00
Atream
5f8cdc7640 Create SECURITY.md 2025-04-09 19:50:38 +08:00
Atream
92a67ab549 Merge pull request #1101 from kvcache-ai/Atream-patch-5
Update llama4.md
2025-04-09 19:23:46 +08:00
Atream
98dbdcd66c Update llama4.md 2025-04-09 19:23:35 +08:00
Atream
9fad782c1a Merge pull request #1100 from kvcache-ai/Atream-patch-4
Update llama4.md
2025-04-09 19:10:03 +08:00
Atream
346d202297 Update llama4.md 2025-04-09 19:09:44 +08:00
Atream
a46c43b2db Merge pull request #1099 from kvcache-ai/Atream-patch-3
Update llama4.md
2025-04-09 18:01:46 +08:00
Atream
d1fcb208cc Update llama4.md 2025-04-09 18:01:13 +08:00
Atream
0774fe4d62 Merge pull request #1098 from kvcache-ai/Atream-patch-2
Update llama4.md
2025-04-09 17:58:44 +08:00
Atream
ed2b971e02 Update llama4.md 2025-04-09 17:57:37 +08:00
Jianwei Dong
c689b23364 Merge pull request #1097 from kvcache-ai/update-llama4-tutorial
update llama4 tutorial
2025-04-09 17:40:58 +08:00
djw
26798500bd update llama4 tutorial 2025-04-09 09:40:08 +00:00
Jianwei Dong
1e0be68e51 Merge pull request #1096 from kvcache-ai/update-llama4-tutorial
update llama4 tutorial
2025-04-09 17:37:33 +08:00
djw
f73b4ca706 update llama4 tutorial 2025-04-09 09:36:30 +00:00
Jianwei Dong
2de96a1f05 Merge pull request #1095 from kvcache-ai/update-llama4-tutorial
update llama4 tutorial
2025-04-09 17:35:14 +08:00
djw
ecc3028c13 update llama4 tutorial 2025-04-09 09:34:04 +00:00
Azure
a74a58d864 Merge pull request #1091 from aubreyli/add_g++
balance_serve: Add g++ to compiler list
2025-04-09 14:40:30 +08:00
Yuhao Tsui
877aec858e Merge branch 'kvcache-ai:main' into main 2025-04-09 11:46:39 +08:00
Aubrey Li
45d20fa87b balance_serve: Add g++ to compiler list
In some OS distributions, g++ exists in the following form:

  # ls -l /usr/bin/g++*
  -rwxr-xr-x 4 root root 985784 Dec  9 12:51 /usr/bin/g++

So make sure to add g++ to the compiler list as well.
2025-04-09 11:25:35 +08:00
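
The idea of a compiler list that also includes plain g++ can be sketched as below. The candidate names other than "g++" and the function `find_compiler` are assumptions made for illustration; the project's actual list is not shown in this log.

```python
import shutil
from typing import Callable, Optional, Sequence

# Hypothetical candidate list: the versioned names are illustrative,
# while the unversioned "g++" is the entry the commit describes adding.
CANDIDATES = ("g++-13", "g++-12", "g++-11", "g++")


def find_compiler(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

On a distribution that ships only /usr/bin/g++, the versioned names fail to resolve and the unversioned entry is what makes the lookup succeed.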
Atream
9037bf30d5 Merge pull request #1090 from kvcache-ai/Atream-patch-1
Update attention.py
2025-04-09 10:54:37 +08:00