feat: Add XML tool call processing system for GLM-4.5 and Qwen3-Coder models #378
base: main
Conversation
- Add BaseXMLToolCallProcessor abstract class for extensible XML processing
- Implement GLM45ToolCallProcessor for the GLM-4.5-specific XML format
- Add XMLToolCallProcessorFactory with a factory pattern for processor creation
- Extend TemplateMetadata with XML support fields (tool_call_format, xml_processor_type)
- Enhance ToolCallProcessor with XML routing and a from_xml() method
- Fix Tool class import issues in tools.py
- Add comprehensive test suite with 16 passing tests
- Add complete implementation documentation

This enables GLM-4.5 models to work with TabbyAPI's tool calling system by converting their XML format to OpenAI's JSON format. The system is backward compatible and provides a foundation for other XML-based models.
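As an illustration of what this conversion does, here is a minimal, hypothetical sketch of turning a GLM-4.5 XML tool call (the dialect defined in the template below) into an OpenAI-style tool call entry. The helper name glm45_xml_to_openai and the regex-based parsing are simplifications for this thread, not the actual GLM45ToolCallProcessor code.

import json
import re
import uuid

# Simplified stand-in for GLM45ToolCallProcessor: parse <tool_call> blocks in the
# GLM-4.5 dialect and emit OpenAI-style tool_calls entries.
def glm45_xml_to_openai(text: str) -> list[dict]:
    tool_calls = []
    for block in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL):
        # The first line is the function name, the rest are <arg_key>/<arg_value> pairs.
        name, _, rest = block.partition("\n")
        keys = re.findall(r"<arg_key>(.*?)</arg_key>", rest, re.DOTALL)
        values = re.findall(r"<arg_value>(.*?)</arg_value>", rest, re.DOTALL)
        arguments = {k.strip(): v.strip() for k, v in zip(keys, values)}
        tool_calls.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",
            "type": "function",
            "function": {"name": name.strip(), "arguments": json.dumps(arguments)},
        })
    return tool_calls

example = "<tool_call>get_weather\n<arg_key>city</arg_key>\n<arg_value>Berlin</arg_value>\n</tool_call>"
print(glm45_xml_to_openai(example))
# [{'id': 'call_…', 'type': 'function', 'function': {'name': 'get_weather', 'arguments': '{"city": "Berlin"}'}}]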
GLM 4.5 chat template

{#- TabbyAPI-compatible GLM 4.5 template with XML tool call processing -#}
{#- XML Tool Call Processing Configuration -#}
{%- set stop_strings = ["<|user|>", "<|assistant|>", "<|observation|>", "<|system|>"] -%}
{%- set tool_start = "<tool_call>" -%}
{%- set tool_end = "</tool_call>" -%}
{%- set tool_call_format = "xml" -%}
{%- set xml_processor_type = "glm45" -%}
[gMASK]<sop>
{%- if tools -%}
<|system|>
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{% for tool in tools %}
{{ tool | tojson }}
{% endfor %}
</tools>
For each function call, output the function name and arguments within the following XML format:
<tool_call>{function-name}
<arg_key>{arg-key-1}</arg_key>
<arg_value>{arg-value-1}</arg_value>
<arg_key>{arg-key-2}</arg_key>
<arg_value>{arg-value-2}</arg_value>
...
</tool_call>{%- endif -%}
{%- macro visible_text(content) -%}
{%- if content is string -%}
{{- content }}
{%- elif content is iterable and content is not mapping -%}
{%- for item in content -%}
{%- if item is mapping and item.type == 'text' -%}
{{- item.text }}
{%- elif item is string -%}
{{- item }}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{- content }}
{%- endif -%}
{%- endmacro -%}
{%- set ns = namespace(last_user_index=-1) %}
{%- for m in messages %}
{%- if m.role == 'user' %}
{% set ns.last_user_index = loop.index0 -%}
{%- endif %}
{%- endfor %}
{% for m in messages %}
{%- if m.role == 'user' -%}<|user|>
{{ visible_text(m.content) }}
{{- '/nothink' if (enable_thinking is defined and not enable_thinking and not visible_text(m.content).endswith("/nothink")) else '' -}}
{%- elif m.role == 'assistant' -%}
<|assistant|>
{%- set reasoning_content = '' %}
{%- set content = visible_text(m.content) %}
{%- if m.reasoning_content is string %}
{%- set reasoning_content = m.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_user_index and reasoning_content -%}
{{ '\n<think>' + reasoning_content.strip() + '</think>'}}
{%- else -%}
{{ '\n<think></think>' }}
{%- endif -%}
{%- if content.strip() -%}
{{ '\n' + content.strip() }}
{%- endif -%}
{# When rendering the assistant’s tool_calls, support both string and mapping #}
{%- if m.role == 'assistant' and m.tool_calls -%}
{%- for tc in m.tool_calls if tc.type == 'function' -%}
<tool_call>{{ tc.function.name }}
{%- set _raw_args = tc.function.arguments %}
{%- if _raw_args is mapping -%}
{%- for k, v in _raw_args.items() -%}
<arg_key>{{ k }}</arg_key>
<arg_value>{{ v | tojson if v is not string else v }}</arg_value>
{%- endfor -%}
{%- elif _raw_args is string -%}
<arg_key>__raw__</arg_key>
<arg_value>{{ _raw_args }}</arg_value>
{%- endif -%}
</tool_call>
{%- endfor -%}
{%- endif -%}
{%- elif m.role == 'tool' -%}
{%- if m.content is string -%}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|observation|>' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- m.content }}
{{- '\n</tool_response>' }}
{%- else -%}
<|observation|>{% for tr in m.content %}
<tool_response>
{{ tr.output if tr.output is defined else tr }}
</tool_response>{% endfor -%}
{% endif -%}
{%- elif m.role == 'system' -%}
<|system|>
{{ visible_text(m.content) }}
{%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
<|assistant|>{{- '\n<think></think>' if (enable_thinking is defined and not enable_thinking) else '' -}}
{%- endif -%}
There are a few other models that will need something like this: Qwen3-Coder, Seed OSS, etc.
- Add Qwen3CoderToolCallProcessor for nested XML format parsing
- Support function=name and parameter=name attribute-based parsing
- Handle multi-line parameter values in the Qwen3-coder format
- Create qwen3-coder-tabbyapi.jinja template with complete XML metadata
- Add comprehensive test suite with 10 Qwen3-coder-specific tests
- Restrict GLM45ToolCallProcessor to only 'glm45' (remove glm-4.5, glm4 aliases)
- Restrict Qwen3CoderToolCallProcessor to only 'qwen3-coder' (not qwen3)
- Rename documentation to XML-Tool-Calling-Implementation.md
- Update documentation to cover both GLM-4.5 and Qwen3-coder formats
- All 28 tests passing with proper processor restrictions

This enables Qwen3-coder models to work with TabbyAPI's tool calling system while maintaining strict separation between different model formats.
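For comparison with the GLM-4.5 sketch earlier in the thread, here is a similarly simplified, hypothetical sketch of parsing the nested Qwen3-coder dialect (function=name / parameter=name attributes, multi-line values). It is not the actual Qwen3CoderToolCallProcessor code.

import json
import re

# Simplified stand-in for Qwen3CoderToolCallProcessor: parse the nested
# <tool_call><function=...><parameter=...> dialect into OpenAI-style tool calls.
def qwen3_coder_xml_to_openai(text: str) -> list[dict]:
    tool_calls = []
    for block in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL):
        func = re.search(r"<function=([^>]+)>(.*?)</function>", block, re.DOTALL)
        if not func:
            continue
        name, body = func.group(1), func.group(2)
        # Parameter values may span multiple lines, hence DOTALL and lazy matching.
        params = re.findall(r"<parameter=([^>]+)>\n?(.*?)\n?</parameter>", body, re.DOTALL)
        arguments = dict(params)
        tool_calls.append({
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(arguments)},
        })
    return tool_calls

example = (
    "<tool_call>\n<function=write_file>\n"
    "<parameter=path>\nnotes.txt\n</parameter>\n"
    "<parameter=content>\nline one\nline two\n</parameter>\n"
    "</function>\n</tool_call>"
)
print(qwen3_coder_xml_to_openai(example))
# [{'type': 'function', 'function': {'name': 'write_file', 'arguments': '{"path": "notes.txt", "content": "line one\nline two"}'}}]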
Qwen3-Coder template

{#- TabbyAPI-compatible Qwen3-coder template with XML tool call processing -#}
{#- XML Tool Call Processing Configuration -#}
{%- set stop_strings = ["<|im_end|>"] -%}
{%- set tool_start = "<tool_call>" -%}
{%- set tool_end = "</tool_call>" -%}
{%- set tool_call_format = "xml" -%}
{%- set xml_processor_type = "qwen3-coder" -%}
{% macro render_extra_keys(json_dict, handled_keys) %}
{%- if json_dict is mapping %}
{%- for json_key in json_dict if json_key not in handled_keys %}
{%- if json_dict[json_key] is mapping or (json_dict[json_key] is sequence and json_dict[json_key] is not string) %}
{{- '\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | tojson | safe) ~ '</' ~ json_key ~ '>' }}
{%- else %}
{{-'\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | string) ~ '</' ~ json_key ~ '>' }}
{%- endif %}
{%- endfor %}
{%- endif %}
{% endmacro %}
{%- if messages[0]["role"] == "system" %}
{%- set system_message = messages[0]["content"] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = [] %}
{%- endif %}
{%- if system_message is defined %}
{{- "<|im_start|>system\n" + system_message }}
{%- else %}
{%- if tools is iterable and tools | length > 0 %}
{{- "<|im_start|>system\nYou are Qwen, a helpful AI assistant that can interact with a computer to solve tasks." }}
{%- endif %}
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
{{- "\n\n# Tools\n\nYou have access to the following functions:\n\n" }}
{{- "<tools>" }}
{%- for tool in tools %}
{%- if tool.function is defined %}
{%- set tool = tool.function %}
{%- endif %}
{{- "\n<function>\n<name>" ~ tool.name ~ "</name>" }}
{%- if tool.description is defined %}
{{- '\n<description>' ~ (tool.description | trim) ~ '</description>' }}
{%- endif %}
{{- '\n<parameters>' }}
{%- if tool.parameters is defined and tool.parameters is mapping and tool.parameters.properties is defined and tool.parameters.properties is mapping %}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{{- '\n<parameter>' }}
{{- '\n<name>' ~ param_name ~ '</name>' }}
{%- if param_fields.type is defined %}
{{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
{%- endif %}
{%- if param_fields.description is defined %}
{{- '\n<description>' ~ (param_fields.description | trim) ~ '</description>' }}
{%- endif %}
{%- set handled_keys = ['name', 'type', 'description'] %}
{{- render_extra_keys(param_fields, handled_keys) }}
{{- '\n</parameter>' }}
{%- endfor %}
{%- endif %}
{% set handled_keys = ['type', 'properties'] %}
{{- render_extra_keys(tool.parameters, handled_keys) }}
{{- '\n</parameters>' }}
{%- set handled_keys = ['type', 'name', 'description', 'parameters'] %}
{{- render_extra_keys(tool, handled_keys) }}
{{- '\n</function>' }}
{%- endfor %}
{{- "\n</tools>" }}
{{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
{%- endif %}
{%- if system_message is defined %}
{{- '<|im_end|>\n' }}
{%- else %}
{%- if tools is iterable and tools | length > 0 %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- for message in loop_messages %}
{%- if message.role == "assistant" and message.tool_calls is defined and message.tool_calls is iterable and message.tool_calls | length > 0 %}
{{- '<|im_start|>' + message.role }}
{%- if message.content is defined and message.content is string and message.content | trim | length > 0 %}
{{- '\n' + message.content | trim + '\n' }}
{%- endif %}
{%- for tool_call in message.tool_calls %}
{%- if tool_call.function is defined %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
{%- if tool_call.arguments is defined %}
{%- for args_name, args_value in tool_call.arguments|items %}
{{- '<parameter=' + args_name + '>\n' }}
{%- set args_value = args_value | tojson | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
{{- args_value }}
{{- '\n</parameter>\n' }}
{%- endfor %}
{%- endif %}
{{- '</function>\n</tool_call>' }}
{%- endfor %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "user" or message.role == "system" or message.role == "assistant" %}
{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
{%- elif message.role == "tool" %}
{%- if loop.previtem and loop.previtem.role != "tool" %}
{{- '<|im_start|>user\n' }}
{%- endif %}
{{- '<tool_response>\n' }}
{{- message.content }}
{{- '\n</tool_response>\n' }}
{%- if not loop.last and loop.nextitem.role != "tool" %}
{{- '<|im_end|>\n' }}
{%- elif loop.last %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
In 'endpoints/OAI/utils/chat_completion.py' you're missing the tools input for _create_response.
It should be:
Otherwise you will get this error:
Thank you for catching that. Fixed.
Tried that today with MikeRoz/GLM-4.5-exl3 (q4, revised version) and OpenWebUI's native tool calling (non-native works). Spent a few hours debugging but couldn't figure out how to make it use the tools in OpenWebUI.
None of that helped, unfortunately. OpenWebUI still does not display the tool call result after the second generation pass; I see the model generating tokens, but the client never shows the function call output. Any ideas?
I've been doing some clean-room experiments with GLM and tool calling, and I've found it will randomly stop obeying the Jinja template and revert to its built-in knowledge. This usually cascades into it getting confused and then sort of giving up. It also sometimes gets confused, either after a while or immediately, when it receives tool call results back in JSON format; it's sort of a crap shoot at that point whether it recovers or not. I'm having a lot more luck today: I started fresh and ported the vLLM GLM XML parser into TabbyAPI, and beyond that I added something as a test, and it seems to be keeping GLM from getting confused or ignoring tool call results (which otherwise cascades into it spamming XML tool calls over and over). My new parser is bidirectional, so when the JSON tool calling results hit TabbyAPI, I'm converting them back into XML call results before GLM receives them. Seems to be going well, but this model is a little unhinged when it gets confused, so I'll keep tinkering, and if I get something that gives me consistent results, I will try to retrofit bits into your solution if possible.
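As a rough illustration of the bidirectional idea described above (this is not code from this PR, and the helper name is made up): before a JSON tool result is handed back to GLM, it could be re-encoded into the same XML dialect the model emits, so the model never has to context-switch between formats.

import json

# Hypothetical re-encoder: turn a JSON tool result back into GLM-style
# <arg_key>/<arg_value> pairs before it is wrapped in <tool_response>.
def json_result_to_glm_xml(result_json: str) -> str:
    data = json.loads(result_json)
    if not isinstance(data, dict):
        return str(data)
    lines = []
    for key, value in data.items():
        rendered = value if isinstance(value, str) else json.dumps(value)
        lines.append(f"<arg_key>{key}</arg_key>")
        lines.append(f"<arg_value>{rendered}</arg_value>")
    return "\n".join(lines)

print(json_result_to_glm_xml('{"temperature": 21, "unit": "C"}'))
# <arg_key>temperature</arg_key>
# <arg_value>21</arg_value>
# <arg_key>unit</arg_key>
# <arg_value>C</arg_value>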
That's great! Native tool calling would make GLM 4.6 much more useful! That's one of the reasons I already thought about switching to vLLM, but VRAM is still limited...
🎯 Overview
This PR implements a comprehensive XML tool call processing system that enables both GLM-4.5 and Qwen3-coder models to work seamlessly with TabbyAPI's tool calling functionality. These models generate tool calls in different XML formats, but TabbyAPI expects OpenAI's JSON format. This implementation bridges that gap with a generic, extensible solution.
🚀 Key Features
📋 What's Changed
Core Implementation

endpoints/OAI/utils/xml_tool_processors.py
- BaseXMLToolCallProcessor: Abstract base class for extensible XML processing
- GLM45ToolCallProcessor: GLM-4.5 specific XML parser
- Qwen3CoderToolCallProcessor: Qwen3-coder specific nested XML parser (supports "qwen3-coder" only)
- XMLToolCallProcessorFactory: Factory pattern for creating appropriate processors

common/templating.py
- New metadata fields: tool_call_format, xml_processor_type, tool_start, tool_end, stop_strings
- Extended extract_metadata() to handle the complete XML configuration

endpoints/OAI/utils/tools.py
- from_xml() method for XML-specific processing
- from_text() method for automatic format detection and routing
- Fixed imports for the Tool class

templates/tool_calls/qwen3-coder-tabbyapi.jinja
- New TabbyAPI-compatible Qwen3-coder template

Quality Assurance
- tests/test_xml_tool_calls.py: Comprehensive test suite covering both formats
- docs/XML-Tool-Calling-Implementation.md: Implementation documentation covering both model types
🔧 Technical Details
Supported XML Formats
GLM-4.5 Format: a flat <tool_call> block that starts with the function name followed by <arg_key>/<arg_value> pairs (see the GLM 4.5 template above).
Qwen3-coder Format: a nested <tool_call><function=name><parameter=name> block whose parameter values may span multiple lines (see the Qwen3-Coder template above).
Both convert to OpenAI's JSON tool_calls format.
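Illustrative only; the concrete payloads below are assumptions based on the templates in this PR, not captured model output.

# GLM-4.5 dialect: flat name line plus <arg_key>/<arg_value> pairs.
glm45_example = """<tool_call>get_weather
<arg_key>city</arg_key>
<arg_value>Berlin</arg_value>
</tool_call>"""

# Qwen3-coder dialect: nested function=/parameter= attributes, multi-line values allowed.
qwen3_coder_example = """<tool_call>
<function=get_weather>
<parameter=city>
Berlin
</parameter>
</function>
</tool_call>"""

# Both normalize to the OpenAI tool_calls structure, roughly:
openai_tool_call = {
    "id": "call_abc123",  # placeholder id
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
}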
Architecture
- XMLToolCallProcessorFactory creates the appropriate processor
- BaseXMLToolCallProcessor enables easy extension for other models

Processor Restrictions
- GLM-4.5 templates use only the "glm45" processor type
- Qwen3-coder templates use only the "qwen3-coder" processor type
🧪 Testing
Test coverage includes both the GLM-4.5 and Qwen3-coder parsers, with 28 tests passing in total (including 10 Qwen3-coder-specific tests) and the processor restrictions above enforced.
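A hypothetical example of the kind of round-trip check this suite covers; the actual test names and helpers in tests/test_xml_tool_calls.py will differ.

import json
import re

def test_glm45_arguments_round_trip():
    # Parse a single GLM-4.5 tool call the same way the sketch above does and check
    # that the arguments survive the XML -> JSON round trip.
    xml = "<tool_call>get_weather\n<arg_key>city</arg_key>\n<arg_value>Berlin</arg_value>\n</tool_call>"
    name = xml.split("<tool_call>")[1].split("\n")[0]
    args = dict(zip(
        re.findall(r"<arg_key>(.*?)</arg_key>", xml),
        re.findall(r"<arg_value>(.*?)</arg_value>", xml),
    ))
    assert name == "get_weather"
    assert json.loads(json.dumps(args)) == {"city": "Berlin"}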
📖 Usage
GLM-4.5 Models: load the GLM 4.5 template above, which sets tool_call_format = "xml" and xml_processor_type = "glm45".
Qwen3-coder Models: load templates/tool_calls/qwen3-coder-tabbyapi.jinja, which sets xml_processor_type = "qwen3-coder".
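Client-side, nothing changes: once one of these templates is loaded, tool calling goes through TabbyAPI's normal OpenAI-compatible endpoint. The base_url, API key, and model name below are placeholders for your own setup, not values from this PR.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="your-tabbyapi-key")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="GLM-4.5",  # or a Qwen3-Coder model paired with its template
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
# The XML the model emits is converted server-side, so this arrives as plain OpenAI JSON.
print(response.choices[0].message.tool_calls)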
🔄 Backward Compatibility
Existing JSON-based tool calling continues to work unchanged; the XML path is only used for templates that declare an XML tool call format.
🎉 Benefits
🔍 Files Changed
- common/templating.py - Enhanced template metadata system with complete XML support
- endpoints/OAI/utils/tools.py - Added XML processing capabilities with format detection
- endpoints/OAI/utils/xml_tool_processors.py - New XML processor system with GLM-4.5 and Qwen3-coder support
- templates/tool_calls/qwen3-coder-tabbyapi.jinja - New TabbyAPI-compatible Qwen3-coder template
- tests/test_xml_tool_calls.py - Comprehensive test suite covering both formats
- docs/XML-Tool-Calling-Implementation.md - Updated documentation covering both model types

This implementation resolves XML tool calling compatibility for both GLM-4.5 and Qwen3-coder models while providing a solid foundation for supporting additional XML-based models in the future.