Replies: 3 comments 1 reply
-
I have a branch which fixes the issue (you also need to fix the chat template), but it was not accepted as a solution. More context in the PR: #15124. The patch is really simple, and I'm using it until a proper fix is merged into llama.cpp.
-
A more complete solution (compared to #15124) is available here: #15158. It fixes a crash, a number of parsing issues around channels (and reasoning), and tool calling. Please give it a try and leave feedback in the PR. Make sure to also test with the patch I shared at the end of the current discussion (#15158 (comment)); it has additional fixes that make it work for me so far, but additional testing would be neat :)
-
The best PR so far is #15181, as it not only fixes the special tokens appearing in the response but also implements tool calling correctly!
-
Hi, I'm using llama-server to serve the GPT-OSS model. It runs fine, but there are special tokens like <|start|>assistant<|channel|>final<|message|> in the response. Some coding tools (Cline, Roo Code, Kilo Code) cannot parse them correctly. Is there any way to remove these tokens?
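While waiting for one of the PRs above to land, a crude client-side workaround is to strip the leaked tokens yourself. Below is a minimal sketch in Python, assuming llama-server's OpenAI-compatible endpoint on localhost:8080 and that the leaked tokens follow the `<|start|>...<|message|>` header pattern from the question; the URL and regexes are illustrative assumptions, not part of any of the linked PRs, and the exact token sequence may vary with your setup.

```python
import re
import requests  # pip install requests

# Illustrative assumption: llama-server's OpenAI-compatible endpoint on the
# default port; adjust host, port, and request body to your setup.
URL = "http://localhost:8080/v1/chat/completions"

# Matches a leaked header such as
# "<|start|>assistant<|channel|>final<|message|>" at the start of the text...
HEADER_RE = re.compile(r"^\s*<\|start\|>.*?<\|message\|>", re.DOTALL)
# ...and trailing end-of-message tokens such as <|return|> or <|end|>.
TRAILER_RE = re.compile(r"<\|(?:return|end)\|>\s*$")

def strip_special_tokens(text: str) -> str:
    """Remove leaked <|...|> control tokens from a completion."""
    return TRAILER_RE.sub("", HEADER_RE.sub("", text)).strip()

resp = requests.post(URL, json={
    "messages": [{"role": "user", "content": "Hello!"}],
})
content = resp.json()["choices"][0]["message"]["content"]
print(strip_special_tokens(content))
```

This only hides the symptom on the client side; the proper fix is in the server's template and parsing, per the PRs above.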