Introduce AI persona framework #1324

Open
wants to merge 30 commits into
base: main

Conversation

dlqqq
Member

@dlqqq dlqqq commented Apr 15, 2025

Description

This is a major feature PR that introduces an AI persona framework to Jupyter AI and allows multiple AI personas to be added per chat. Personas are analogous to "bot accounts" in other chat applications such as Slack or Discord.

High-level summary of changes:

  • "Jupyternaut" has been redefined as a default AI persona provided by Jupyter AI. Jupyternaut will use the same interface as every other AI persona.

  • This PR changes Jupyternaut to reply only when @-mentioned in the chat. The same rule applies to every AI persona, which lets users control precisely when & which AI personas respond.

    • This also enables future work surrounding multi-agent collaboration (MAC). AI personas are able to @-mention each other to dispatch tasks or use agentic tools only available through another persona.
  • This PR allows other packages to add fully-custom AI personas. Developers can define the name, avatar, and how new messages are handled. Custom AI personas can use any AI model or AI library of their choice, and now have full control over how new messages are handled.
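
As a rough illustration of the "only reply when @-mentioned" rule above, the sketch below picks out which personas a message addresses. It is not the code in this PR: `mentioned_personas` and the regex are hypothetical, and the real chat may track mentions as structured data rather than by parsing the message body.

```python
import re

# Hypothetical pattern: "@" followed by word characters or hyphens.
MENTION_RE = re.compile(r"@([\w-]+)")


def mentioned_personas(body: str, persona_names: list[str]) -> list[str]:
    """Return the known personas @-mentioned in `body`, in order, without duplicates."""
    # Match case-insensitively, but report the canonical persona name.
    known = {name.lower(): name for name in persona_names}
    found: list[str] = []
    for match in MENTION_RE.finditer(body):
        name = known.get(match.group(1).lower())
        if name is not None and name not in found:
            found.append(name)
    return found
```

Under this rule, a message with no mention yields an empty list, so no persona replies.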

Demo

Screen.Recording.2025-04-13.at.6.10.26.PM.mov

Technical summary

Jupyter AI v2 had a concept of "personas", but these were essentially just names & avatars that could only appear when custom model providers were in use. This caused several issues for developers:

  • In v2, a custom persona only appeared in the chat after the user manually switched to a custom model provider and sent a new message.
  • In v2, developers were forced to use LangChain to build a custom model provider to provide a custom persona.
  • In v2, developers could not change any details about "how" their custom model provider was being called. That was managed exclusively by DefaultChatHandler in Jupyter AI.

This PR completely re-defines the concept of a persona in Jupyter AI v3 to address these issues.

Personas are now each defined in a class extending BasePersona, an abstract base class provided by Jupyter AI. A summary of the interface is as follows:

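from abc import ABC, abstractmethod

# YChat, PersonaManager, ConfigManager, Logger, PersonaAwareness,
# PersonaDefaults, and Message are imported elsewhere; omitted in this summary.

class BasePersona(ABC):
    ychat: YChat                 # the single chat this persona instance is scoped to
    manager: 'PersonaManager'
    config: ConfigManager
    log: Logger
    awareness: PersonaAwareness

    @property
    @abstractmethod
    def defaults(self) -> PersonaDefaults:
        pass

    @abstractmethod
    async def process_message(self, message: Message) -> None:
        pass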
  • Each persona instance is scoped to a single chat under self.ychat.
  • Each persona defines its default name and avatar under self.defaults, which returns a PersonaDefaults instance. PersonaDefaults is a Pydantic data model.
    • We want to eventually offer users persona-specific configuration to allow users to change the name & avatar of each persona at runtime through the Jupyter AI settings. That's why the persona name & avatar are defined under self.defaults.
  • Each persona fully defines how new messages are handled in self.process_message(). This is just a plain async function which can do anything. You can use any AI model or AI library of your choice, as long as it is installed in your Python environment.
    • Writing/streaming replies should be done through the methods available on self.ychat, e.g. self.ychat.add_message().
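
To make the interface above concrete, here is a minimal, self-contained sketch of a custom persona. Everything except the shape of `BasePersona` is a stand-in so the example runs without Jupyter AI installed: `FakeChat`, the `Message` fields, the `avatar_path` field, and the `add_message()` signature are assumptions for illustration, and `PersonaDefaults` is a dataclass here rather than the real Pydantic model.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class PersonaDefaults:
    """Stand-in for the real Pydantic model; field names are assumptions."""
    name: str
    avatar_path: str


@dataclass
class Message:
    """Stand-in chat message; the real Message type has a different definition."""
    sender: str
    body: str


@dataclass
class FakeChat:
    """Stand-in for YChat that just records the messages added to it."""
    messages: list = field(default_factory=list)

    def add_message(self, message: Message) -> None:  # hypothetical signature
        self.messages.append(message)


class BasePersona(ABC):
    """Reduced version of the interface summarized above."""

    def __init__(self, ychat: FakeChat):
        self.ychat = ychat

    @property
    @abstractmethod
    def defaults(self) -> PersonaDefaults: ...

    @abstractmethod
    async def process_message(self, message: Message) -> None: ...


class EchoPersona(BasePersona):
    """Toy persona that replies by echoing the incoming message."""

    @property
    def defaults(self) -> PersonaDefaults:
        return PersonaDefaults(name="Echo", avatar_path="/static/echo.svg")

    async def process_message(self, message: Message) -> None:
        # Any AI model or library could be called here; this sketch calls none.
        reply = Message(sender=self.defaults.name, body=f"You said: {message.body}")
        self.ychat.add_message(reply)


def demo() -> list:
    chat = FakeChat()
    persona = EchoPersona(chat)
    asyncio.run(persona.process_message(Message(sender="alice", body="hello")))
    return chat.messages
```

The point of the sketch is that `process_message()` owns the whole reply path: the persona decides what to call and writes its reply back through the chat object.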

To help orchestrate personas for each chat, this PR also defines a new PersonaManager class. A new PersonaManager is initialized for each new YChat instance automatically. This class helps initialize the set of BasePersona instances for each chat.
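
A simplified picture of the per-chat manager might look like the sketch below. It is an assumption-laden stand-in, not the PersonaManager in this PR: the constructor arguments, the `route_message()` signature, the substring-based mention check, and the toy `Greeter` class are all hypothetical.

```python
import asyncio


class PersonaManager:
    """Hypothetical per-chat manager: one manager, and one set of persona instances, per chat."""

    def __init__(self, chat_id: str, persona_classes: list[type]):
        self.chat_id = chat_id
        # Instantiate every persona class once for this chat.
        self.personas = {cls.name: cls(chat_id) for cls in persona_classes}

    async def route_message(self, body: str) -> list[str]:
        # Naive check: a persona is addressed if "@<name>" appears in the body.
        targets = [p for name, p in self.personas.items() if f"@{name}" in body]
        # Let all addressed personas handle the message concurrently.
        return list(await asyncio.gather(*(p.process_message(body) for p in targets)))


class Greeter:
    """Toy persona used only to exercise the manager."""

    name = "Greeter"

    def __init__(self, chat_id: str):
        self.chat_id = chat_id

    async def process_message(self, body: str) -> str:
        return f"[{self.chat_id}] Greeter got: {body}"
```

Because each manager builds its own instances, two chats never share persona state in this sketch.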

To allow other packages to install personas, persona classes are loaded from the "jupyter_ai.personas" entry point group (EPG) when the server extension starts. Any third-party package can define a persona class and provide it to this EPG to add a custom persona to Jupyter AI.
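
Entry point discovery of this kind can be sketched with the standard library alone. The group name comes from the description above; the function name, the error handling, and the package and class names in the comment are hypothetical, and the actual loading code in this PR may differ.

```python
from importlib.metadata import entry_points

# A third-party package would advertise its persona in pyproject.toml, e.g.
# (package, module, and class names here are hypothetical):
#
#   [project.entry-points."jupyter_ai.personas"]
#   my-persona = "my_package.persona:MyPersona"


def load_persona_classes(group: str = "jupyter_ai.personas") -> dict:
    """Load every object registered under `group`, keyed by entry point name."""
    eps = entry_points()
    # Newer Pythons expose .select(); on Python 3.9 and earlier entry_points() returns a dict.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    classes = {}
    for ep in selected:
        try:
            classes[ep.name] = ep.load()
        except Exception as exc:
            # One broken third-party package should not block the others.
            print(f"Skipping persona entry point {ep.name!r}: {exc}")
    return classes
```

If no installed package registers anything under the group, the function simply returns an empty dict.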

Architecture changes

  • The chat handlers in v2 no longer do anything in this branch.
  • The JupyternautPersona implementation now fully defines how Jupyternaut handles new messages, superseding DefaultChatHandler.

Related issues

Other details

  • This PR is still a WIP.
  • We may add demo implementations of custom personas to the jupyter_ai_test package. That way, others can switch to this branch and test it via jlpm dev-reinstall.

@3coins
Collaborator

3coins commented May 5, 2025

@dlqqq
While the overall concept for personas is spot on and aligns with the growing interest in multi-agent use cases, I think the resulting UX is not optimal with the opinionated @persona approach. Jupyter AI consumers (users and extension authors) would require more flexibility in handling the persona calls. For example, in chats with a single persona and a human, the use of @persona is redundant (and annoying). Another example is turn-by-turn conversations, where follow-up questions are a very natural part of a conversation; having users add @persona to each of these adds friction to the chat experience. Did we consider making the PersonaManager a configurable traitlet, so extension authors can control the behavior of the route_message method? This opens up the possibility of having another agent (LLM or service) or a set of heuristics route the user messages, making the message routing completely natural and flexible.

Some other implementation specific observations:

  1. It seems like the current approach is to add all registered persona as users during init, regardless of whether they participated (@persona) in the conversation or not. This makes it harder to figure out the actual participants of the chat (this can still be done perhaps by inspecting the messages of a chat).
  2. The current Persona object has the following properties that help identify its purpose/skills: name and description. While these are helpful to some extent, they don't provide the full list of things that this persona can do, which would be useful in cases where a custom PersonaManager uses that information to auto-route the user messages. Might be useful to look at some of the properties defined by a2a to define the capabilities of the persona/agent?
  3. The system_prompt seems to be completely bypassed for Jupyternaut, so I'm not sure of the actual purpose of this field. It seems like a Persona can handle much of the agent creation part on its own, so it's not clear how having system_prompt helps external to the Persona. Is the intended use geared towards configuring the prompt via a config object?

@fperez
Contributor

fperez commented May 16, 2025

Thanks for this work, I'm excited to start testing it and exploring its use for some of my needs (research, teaching & industry). Given that this introduces a fairly major revamp of the current experience, I wonder how hard it would be to provide an easy path for users like me, who aren't 100% in-the-weeds of JLab extension development, to test it and provide feedback. This particular effort, I think, could really use some kicking-the-tires by users (like me) to provide real-world input on the UX, flow, extensibility, etc.

You @dlqqq mention above an easy way of testing it - I'd be grateful if that was possible (or if it already is, LMK perhaps with a minimal explanation for a non-jlab-dev-expert?), and would be delighted to dogfood this and help iterate.

@dlqqq
Member Author

dlqqq commented May 19, 2025

@JGuinegagne @3coins I've addressed all of the feedback. Thank you both for the detailed code review! I'll fix the CI errors shortly once I'm able to see the failures again.

@3coins Thank you also for the high-level comments & for putting so much thought into how this will affect the UX. Let me respond:

I think the resulting UX is not optimal with the opinionated @persona approach. Jupyter AI consumers (users and extension authors) would require more flexibility in handling the persona calls. For example, in chats with a single persona and a human, the use of @persona is redundant (and annoying). Another example is turn-by-turn conversations, where follow-up questions are a very natural part of a conversation; having users add @persona to each of these adds friction to the chat experience.

I agree 100%. Brian & I have discussed 2 different approaches for reducing the overhead added by this while using multiple personas:

  1. Implement direct messages (DMs) in Jupyter Chat to allow you to converse w/ a single AI persona directly, dropping the need for @-mentioning.

  2. Implement threads in Jupyter Chat. We would allow users to reply directly to an AI persona response & create a new thread, just like Slack & Discord. Inside a thread, only the original AI persona will reply, so @-mentioning can also be dropped there.

Did we consider making the PersonaManager a configurable traitlet

Right now, functionality is the top priority as we are trying to make sure v3.0.0 gets released in a timely fashion. Brian had suggested that we should aggressively de-prioritize anything that can be done in a future minor release, including adding configurables.

  1. [...] This makes it harder to figure out the actual participants of the chat

I don't know of any chat application that provides this capability, so I'm not sure if this feature is of interest to JAI users. For example, in Slack, Discord, and iMessage, the list of users is just whoever's been added to that chat. Most users added to a large channel / group chat may never post a message, yet they still appear in the users panel.

  2. [...] Might be useful to look at some of the properties defined by a2a to define the capabilities of the persona/agent?

These are great ideas. However, I think auto-routing for multiple agents & multi-agent collaboration will have to come after we implement single-agent collaboration (e.g. getting a single agent to handle queries & use tools). For now, I think the focus is "capability first, usability next".

  3. The system_prompt seems to be completely bypassed for Jupyternaut, so I'm not sure of the actual purpose of this field. It seems like a Persona can handle much of the agent creation part on its own, so it's not clear how having system_prompt helps external to the Persona. Is the intended use geared towards configuring the prompt via a config object?

Yes, we eventually want it to be configurable, which is why it is an evaluated property. For now, however, the system prompt is hard-coded because we don't yet have a way of editing it through the ConfigManager.

@dlqqq dlqqq changed the title from "[WIP, MAJOR] Introduce AI persona framework" to "Introduce AI persona framework" on May 19, 2025
@dlqqq
Member Author

dlqqq commented May 19, 2025

@fperez Hey Fernando! Thanks for checking in and expressing interest in helping us test this new feature. We also have some demos of this ready.

After this PR is merged, I'll open a new issue that includes instructions on how to add a custom persona. I'll be sure to mention you there!

Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

[v3.0.0ax] Jupyternaut always replies
4 participants