EPP cannot serve /chat/completions API #790
Labels
kind/bug
Categorizes issue or PR as related to a bug.
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
What happened:
Previously, EPP could always serve the `/chat/completions` API as well as the `/completions` API. This behavior seems to be broken in the latest version of EPP. When we test our inference service against the `/chat/completions` API using the latest EPP, we get the following error:
I believe that this 400 status is caused by EPP, because I found the following errors in the EPP logs:
What you expected to happen:
I think that `/chat/completions` should be served just as well as `/completions`.
How to reproduce it (as minimally and precisely as possible):
Just use the latest EPP and test with `/chat/completions` as described above.
Anything else we need to know?:
After searching, I have found that this line of code causes the error:
gateway-api-inference-extension/pkg/epp/handlers/request.go
Line 45 in 35d7f64
I think this could be fixed to support the `/chat/completions` API easily by adding a utility function that extracts the prompt from the request body. For the `/chat/completions` API, the prompts are embedded in the `messages` field, so we just need to handle this case.
If this issue is confirmed by the maintainers, I can assign it to myself and provide a fix.
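To make the proposal concrete, here is a minimal sketch of such a utility function. It is not the actual EPP code: the name `extractPrompt`, the use of an already-decoded `map[string]any` request body, and the choice to join message contents with newlines are all assumptions made for illustration. It also assumes each message's `content` is a plain string.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractPrompt is a hypothetical helper (not part of EPP today).
// It returns the prompt text from a decoded request body, handling both
// the /completions shape ("prompt") and the /chat/completions shape ("messages").
func extractPrompt(body map[string]any) (string, error) {
	// /completions: the prompt is a top-level string field.
	if p, ok := body["prompt"]; ok {
		s, ok := p.(string)
		if !ok {
			return "", errors.New("prompt is not a string")
		}
		return s, nil
	}

	// /chat/completions: the prompt text is spread across messages[i].content.
	raw, ok := body["messages"]
	if !ok {
		return "", errors.New("request body has neither prompt nor messages")
	}
	msgs, ok := raw.([]any)
	if !ok {
		return "", errors.New("messages is not an array")
	}

	parts := make([]string, 0, len(msgs))
	for i, m := range msgs {
		msg, ok := m.(map[string]any)
		if !ok {
			return "", fmt.Errorf("messages[%d] is not an object", i)
		}
		// Assumption: content is a plain string.
		content, ok := msg["content"].(string)
		if !ok {
			return "", fmt.Errorf("messages[%d].content is not a string", i)
		}
		parts = append(parts, content)
	}
	// Assumption: joining with newlines is good enough for the scheduler's purposes.
	return strings.Join(parts, "\n"), nil
}
```

With this shape, the request handler could call one function for both endpoints instead of requiring a `prompt` field, and return a 400 only when neither field is present or well-formed.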
Environment:
`kubectl version`:
`git describe --tags --dirty --always`: