Chat completion endpoint #33
Conversation
- Provide a chat-style completions API that accepts multi-turn context (system, user, assistant) and returns a single assistant message.
- Normalizes roles (merges the first system message into the first user message, maps assistant→model) and builds prompts via the tokenizer chat template. This makes it compatible with Gemma-based models.
- Returns choices[0].message with quota metadata; additive, no breaking changes.
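A minimal sketch of the role normalization described above, assuming OpenAI-style `role`/`content` message dicts (the function name and the exact merge format are illustrative, not the PR's actual code):

```python
def normalize_chat_messages(messages: list[dict]) -> list[dict]:
    """Normalize OpenAI-style messages for a Gemma chat template,
    which only accepts "user" and "model" roles."""
    msgs = [dict(m) for m in messages]

    # Pull out a leading system message, if any.
    system_text = None
    if msgs and msgs[0]["role"] == "system":
        system_text = msgs.pop(0)["content"]

    if system_text is not None:
        if msgs and msgs[0]["role"] == "user":
            # Conversation starts with a user turn: merge system into it.
            msgs[0]["content"] = f"{system_text}\n\n{msgs[0]['content']}"
        else:
            # Starts with an assistant turn (or is empty): emit the
            # system content as its own user turn instead.
            msgs.insert(0, {"role": "user", "content": system_text})

    # Gemma templates use "model" where OpenAI uses "assistant".
    for m in msgs:
        if m["role"] == "assistant":
            m["role"] = "model"
    return msgs
```

The normalized list can then be passed to `tokenizer.apply_chat_template(...)` to build the prompt string.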
Added usage details about the endpoint similar to other existing endpoints
…tions Handle system message placement based on conversation starter:
- Merge system into first user message when conversation starts with user
- Create separate user message for system content when starting with assistant
- Simplify logic by eliminating redundant loops and state tracking
@chimosky @MostlyKIGuess Please review and let me know if there are any changes needed.
Everyone adding an endpoint for something new isn't a good thing moving forward; this would end up becoming bloated. @MostlyKIGuess We need to figure out a better way to handle this.
@chimosky made all the necessary changes and added a refactor commit.
app/ai.py
Outdated
except Exception as e:
    raise Exception(f"Error generating chat completion: {str(e)}")
    raise Exception(f"Error generating chat completion: {str(e)}")
Another missing newline at EOF; you should view your changes with git diff
before making a commit, as you'd easily spot these if you did.
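For example, `git diff` flags a missing trailing newline explicitly (the paths below are just a scratch illustration):

```shell
# Reproduce the "no newline at EOF" marker in a throwaway repo.
mkdir -p /tmp/eof-demo && cd /tmp/eof-demo
git init -q
printf 'x = 1' > demo.py        # printf writes no trailing newline
git add demo.py
# The staged diff ends with: \ No newline at end of file
git diff --cached demo.py
```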
Thank you for the advice. Will keep this in check next time!
I think @mebinthattil's custom endpoint should be perfect for everyone. @mebinthattil could you share some documentation on it on the website? That would be a great PR addition.
raise Exception(f"Error generating response with custom prompt: {str(e)}")

def _normalize_chat_messages(self, messages: list) -> list:
def _normalize_chat_messages(self, messages: list[dict]) -> list[dict]:
Any reason why?
Ibiam suggested that it would be better to indicate that it is a list of dictionaries.
The only changes in api.py are spacing, which makes it a bit weird to review; maybe you can restore this.
Yes, the whole point of #29 was that every activity would not need to raise a PR like this for its use case. I was in a bit of a rush to get that change out and did not document it well. I shall write comprehensive documentation for it and also update the website regarding the same. Thanks for pointing this out. @amannaik247 try checking out how I implemented this in the speak-ai activity using this generic endpoint.
@MostlyKIGuess @mebinthattil I am using his endpoint for one task, but it only accepts a custom prompt and a question. There is no way to send a chat history and get a conversation-style response, so this endpoint is necessary for providing the context of the whole conversation. Most popular APIs have this. As I mentioned in my initial PR message, all models have a separate way to generate a response when we send a chat history, because they are trained on those specific tokens.
- Made a few changes to make it PEP 8 compliant.
- Used an alternative, clearer sentence to explain a function.
Force-pushed from 5b901c5 to ada89de
This makes sense. It'll also be great to modify the endpoint Mebin created to accept this rather than create a whole new endpoint; you can use a default value, set to None, which would have no effect if nothing is passed there. This would be easier to maintain than a whole new endpoint.
This endpoint will serve the conversation part of the story-builder feature in the Write activity, as well as act as a general endpoint that can be used to generate conversational responses from the Gemma model.
Let's say we are using a Gemma model and using this endpoint to generate a reasonable model response.
This endpoint can also be used by any future activities that want to introduce a conversation-related feature. Since the endpoint is similar to most popular chat-completion endpoints, integration with sugar-ai will become easier.
PS: I tested this locally using the `gemma-3-1b-it` and `Qwen-2-1.5B-Instruct` models.
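For illustration, a request/response shape this endpoint could follow, assuming it mirrors the common OpenAI-style chat-completions schema the PR description mentions (field names beyond `choices[0].message` and the usage metadata are assumptions, not sugar-ai's confirmed schema):

```python
import json

# Hypothetical multi-turn request body: system, user, and assistant roles,
# as the story-builder conversation flow would send them.
request_body = {
    "messages": [
        {"role": "system", "content": "You are a friendly story-building helper."},
        {"role": "user", "content": "Start a story about a robot gardener."},
        {"role": "assistant", "content": "Rusty watered his first tomato..."},
        {"role": "user", "content": "What happens next?"},
    ]
}

# A response in the shape the PR describes: a single assistant message
# at choices[0].message, plus quota/usage metadata.
sample_response = json.loads("""{
    "choices": [{"message": {"role": "assistant",
                             "content": "The tomato began to glow."}}],
    "usage": {"prompt_tokens": 42, "completion_tokens": 9}
}""")

reply = sample_response["choices"][0]["message"]["content"]
```

An activity would then append `{"role": "assistant", "content": reply}` to its local history before the next turn.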