Conversation

amannaik247
Member

@amannaik247 commented Sep 17, 2025

This endpoint will serve the conversation part of the story builder feature in the Write activity, and also act as a general endpoint that can be used to generate conversational responses from the Gemma model.

Let's say we are using a Gemma model.

  • Gemma models are trained with specific tokens to generate model responses from a chat history. More can be understood in this documentation.
  • The main purpose of the endpoint is to take the chat history as input, as a list of messages (with the system prompt included), like this one:
"messages": [
    {"role": "system", "content": "You are a helpful AI assistant.Reply in 2-3 sentences."},
    {"role": "user", "content": "What is machine learning?"},
    {"role": "assistant", "content": "Machine learning is a subset of artificial intelligence..."},
    {"role": "user", "content": "Can you give me a simple example?"}
  ]
  • Then convert it to an LLM-compatible format; for Gemma it looks like this:
```text
<start_of_turn>user
You are a helpful AI assistant. Reply in 2-3 sentences.
What is machine learning?<end_of_turn>
<start_of_turn>model
Machine learning is a subset of artificial intelligence...<end_of_turn>
<start_of_turn>user
Can you give me a simple example?<end_of_turn>
```
  • This formatted prompt is then used to generate a reasonable model response (a sketch of this conversion follows this list).

  • This endpoint can also be used by any future activities that want to introduce a conversation-related feature. Since the endpoint is similar to most popular chat-completion endpoints, integration with sugar-ai will become easier.
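
Here is a minimal sketch of the conversion described above, building the Gemma turn markers by hand. It is illustrative only, not the PR's actual code; `to_gemma_prompt` is a hypothetical helper name.

```python
# Minimal sketch (hypothetical helper, not sugar-ai's actual code):
# convert an OpenAI-style message list into Gemma's turn format.
def to_gemma_prompt(messages: list[dict]) -> str:
    turns = []
    system = ""
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            system = content  # hold back; merged into a user turn below
            continue
        if system and role == "user":
            # Conversation starts with a user message: merge system into it.
            content = f"{system}\n{content}"
            system = ""
        elif system and role == "assistant":
            # Conversation starts with an assistant message: emit the
            # system content as its own user turn first.
            turns.append(f"<start_of_turn>user\n{system}<end_of_turn>")
            system = ""
        gemma_role = "model" if role == "assistant" else "user"
        turns.append(f"<start_of_turn>{gemma_role}\n{content}<end_of_turn>")
    turns.append("<start_of_turn>model\n")  # cue the model to reply
    return "\n".join(turns)
```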

PS: I tested this locally using the gemma-3-1b-it and Qwen-2-1.5B-Instruct models.
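
For illustration, a hypothetical client call against the proposed endpoint; the route name and response shape here are assumptions modeled on common chat-completion APIs, not confirmed sugar-ai routes.

```python
# Hypothetical request; the route and response shape are assumptions.
import requests

resp = requests.post(
    "http://localhost:8000/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful AI assistant. Reply in 2-3 sentences."},
            {"role": "user", "content": "What is machine learning?"},
        ]
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```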

- Provide a chat-style completions API that accepts multi-turn context
  (system, user, assistant) and returns a single assistant message.
- Normalize roles (merge the first system message into the first user message, map assistant→model) and build prompts via the tokenizer's chat template, to make the endpoint compatible with Gemma-based models (see the sketch after these commit notes).
- Return choices[0].message with quota metadata; additive, no breaking changes.
Added usage details for the endpoint, similar to other existing endpoints.
…tions

Handle system message placement based on how the conversation starts:
- Merge system content into the first user message when the conversation starts with a user turn
- Create a separate user message for the system content when it starts with an assistant turn
- Simplify the logic by eliminating redundant loops and state tracking
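
As a hedged sketch of the "tokenizer chat template" step mentioned in the notes above, using Hugging Face transformers; the model name is the one the author tested with, and Gemma's own template handles the assistant→model mapping.

```python
# Sketch of prompt building via the tokenizer's chat template, assuming
# Hugging Face transformers and the gemma-3-1b-it model from the PR.
# The system message is assumed to have already been merged into the
# first user message by the normalization step described above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
messages = [
    {"role": "user", "content": "You are a helpful AI assistant. Reply in 2-3 sentences.\nWhat is machine learning?"},
    {"role": "assistant", "content": "Machine learning is a subset of artificial intelligence..."},
    {"role": "user", "content": "Can you give me a simple example?"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# 'prompt' now contains the <start_of_turn>/<end_of_turn> markers
# shown in the PR description.
```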
@amannaik247
Member Author

@chimosky @MostlyKIGuess Please review and let me know if there are any changes needed.

@chimosky
Member

Everyone adding an endpoint for something new isn't a good thing moving forward; this would end up becoming bloated.

@MostlyKIGuess We need to figure out a better way to handle this.

@amannaik247
Member Author

@chimosky made all the necessary changes and added a refactor commit.

app/ai.py (outdated)

```diff
 except Exception as e:
-    raise Exception(f"Error generating chat completion: {str(e)}")
\ No newline at end of file
+    raise Exception(f"Error generating chat completion: {str(e)}")
```

Member

Another missing newline at EOF; you should view your changes with git diff before making a commit, as you'd easily spot these if you did.

Member Author

Thank you for the advice. Will keep this in check next time!

@MostlyKIGuess
Member

> Everyone adding an endpoint for something new isn't a good thing moving forward; this would end up becoming bloated.
>
> @MostlyKIGuess We need to figure out a better way to handle this.

I think @mebinthattil's custom endpoint should be perfect for everyone.

@mebinthattil Could you maybe add documentation for it on the website? That would be a great PR addition.

raise Exception(f"Error generating response with custom prompt: {str(e)}")

def _normalize_chat_messages(self, messages: list) -> list:
def _normalize_chat_messages(self, messages: list[dict]) -> list[dict]:
Member

Any reason why?

Member Author

Ibiam suggested that it would be better to indicate that it is a list of dictionaries.

Member

The only changes in api.py are spacing, which makes it a bit weird to review; maybe restore this.

@mebinthattil
Member

> Everyone adding an endpoint for something new isn't a good thing moving forward; this would end up becoming bloated.
>
> @MostlyKIGuess We need to figure out a better way to handle this.

> I think @mebinthattil's custom endpoint should be perfect for everyone.
>
> @mebinthattil Could you maybe add documentation for it on the website? That would be a great PR addition.

Yes, the whole point of #29 was so that every activity would not need to raise a PR like this for its use case. I was in a bit of a rush to get that change out and did not document it well. I'll write comprehensive documentation for it and also update the website. Thanks for pointing this out.

@amannaik247 Try checking out how I implemented this in the speak-ai activity using this generic endpoint.

@amannaik247
Member Author

amannaik247 commented Sep 26, 2025

> I think @mebinthattil's custom endpoint should be perfect for everyone.

> Yes, the whole point of #29 was so that every activity would not need to raise a PR like this for its use case. I was in a bit of a rush to get that change out and did not document it well. I'll write comprehensive documentation for it and also update the website. Thanks for pointing this out.

@MostlyKIGuess @mebinthattil I am using his endpoint for one task, but it only accepts a custom prompt and a question. There is no way to send a chat history and get a conversation-style response. So, for providing the context of the whole conversation, this endpoint is necessary. Most popular APIs have this /chat/completions endpoint.

As I mentioned in my initial PR message, all models have a separate way to generate a response when we send a chat history, because they are trained on those specific tokens.

- Made a few changes to make it PEP 8 compliant.
- Used a clearer sentence to explain a function.
@chimosky
Member

> @MostlyKIGuess @mebinthattil I am using his endpoint for one task, but it only accepts a custom prompt and a question. There is no way to send a chat history and get a conversation-style response. So, for providing the context of the whole conversation, this endpoint is necessary. Most popular APIs have this /chat/completions endpoint.

This makes sense. It'd also be great to modify the endpoint Mebin created to accept this, rather than create a whole new endpoint; you can use a default value (set to None) which would have no effect if nothing is passed there.

This would be easier to maintain than a whole new endpoint.
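
A rough sketch of that suggestion, assuming a FastAPI route; the route, field names, and the generate() helper are illustrative stand-ins, not sugar-ai's actual API.

```python
# Illustrative only: extend the existing custom-prompt endpoint with an
# optional chat history (default None), per the suggestion above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def generate(text: str) -> str:
    # Stub standing in for actual model inference.
    return f"(model reply to: {text[:40]}...)"

class AskRequest(BaseModel):
    prompt: str
    question: str
    messages: list[dict] | None = None  # None keeps the old single-turn behavior

@app.post("/ask-llm")  # hypothetical route name
def ask_llm(req: AskRequest):
    if req.messages:
        # Multi-turn: build a prompt from the full history, e.g. with the
        # to_gemma_prompt sketch earlier in this thread.
        text = "\n".join(f"{m['role']}: {m['content']}" for m in req.messages)
    else:
        # Single-turn: original custom prompt + question behavior.
        text = f"{req.prompt}\n{req.question}"
    return {"response": generate(text)}
```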
