(Feat) Implement remote Agent type
#821
base: main
Conversation
Signed-off-by: Fabian Gonzalez <[email protected]>
holy moly, there's a lot of merge conflicts. update: resolved merge conflicts locally. unsure if it's all still working. need to test it again + do a self code review to make sure I didn't remove any work from main during merging. need to find my reproduction setup again :rip:
Pull Request Overview
This PR implements support for remote agents in the kagent system, allowing for A2A (Agent-to-Agent) communication with agents hosted on external servers. The implementation includes the creation of a new "Remote" agent type that uses agent cards to discover and communicate with external agents.
Key changes include:
- Added "Remote" agent type with agent card URL and optional server URL configuration
- Implemented UI components for creating, editing, and previewing remote agents
- Created middleware for recording remote agent interactions in the database
- Extended the A2A protocol handling to support remote agent task management
Reviewed Changes
Copilot reviewed 35 out of 37 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| ui/src/types/index.ts | Added Remote agent type and RemoteAgentSpec interface |
| ui/src/lib/messageHandlers.ts | Enhanced token usage extraction for remote agents and improved artifact streaming |
| ui/src/components/sidebars/AgentDetailsSidebar.tsx | Updated to handle remote agents without model display |
| ui/src/components/create/SelectToolsDialog.tsx | Added null safety for search terms and descriptions |
| ui/src/components/AgentsProvider.tsx | Added remote agent validation and form data types |
| ui/src/components/AgentCardPreview.tsx | New component for previewing remote agent cards |
| ui/src/components/AgentCard.tsx | Added preview functionality for remote agents |
| ui/src/app/agents/new/page.tsx | Extended agent creation form with remote agent fields |
| ui/src/app/actions/utils.ts | Fixed error response to include error field |
| ui/src/app/actions/agents.ts | Added getAgentCard API call and remote agent form handling |
| go/api/v1alpha2/agent_types.go | Added Remote agent type and RemoteAgentSpec to CRD |
| go/internal/controller/translator/adk_api_translator.go | Added remote agent card fetching and translation |
| go/internal/controller/reconciler/reconciler.go | Enhanced reconciler to handle remote agents and store agent cards |
| go/internal/a2a/recording_manager.go | New recording manager for remote agent task/session storage |
| go/internal/a2a/a2a_handler_mux.go | Updated A2A handler to use recording manager for remote agents |
| go/internal/httpserver/handlers/agents.go | Added agent card retrieval endpoint |
| go/internal/database/models.go | Extended Agent model with remote config and agent card storage |
EItanya left a comment:
The overall design makes sense, but there are some details that need to be worked out before we can merge this.
go/internal/a2a/recording_manager.go (outdated)

```go
if err := m.dbClient.StoreTask(task); err != nil {
	logger.Error(err, "Failed to store sync task", "taskID", task.ID, "contextID", task.ContextID)
}
```
Why are we failing silently here? The whole point of this functionality is to record this.
Done, I did a more explicit get -> if not exist, create for the Session.
If a session does not exist (or does not get created successfully), then we do not store a task. This assumes a session -> tasks relation. If we allow Tasks to exist without a Session, I could change this to always create a Task.
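The get -> create-if-missing flow described above can be sketched with an in-memory stand-in for the database client; the method names (`GetSession`, `CreateSession`, `StoreTask`) are assumptions for illustration, not kagent's actual interface:

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotFound = errors.New("not found")

// memStore is a stand-in for the real database client.
type memStore struct {
	sessions map[string]bool
	tasks    map[string]string // taskID -> contextID (its owning session)
}

func (s *memStore) GetSession(id string) error {
	if !s.sessions[id] {
		return ErrNotFound
	}
	return nil
}

func (s *memStore) CreateSession(id string) error {
	s.sessions[id] = true
	return nil
}

func (s *memStore) StoreTask(taskID, contextID string) error {
	if !s.sessions[contextID] {
		return fmt.Errorf("session %s does not exist", contextID)
	}
	s.tasks[taskID] = contextID
	return nil
}

// recordTask implements the flow described above: get the session, create
// it only if it's missing, and store the task only once its session is
// known to exist. Errors are propagated instead of being swallowed.
func recordTask(db *memStore, taskID, contextID string) error {
	if err := db.GetSession(contextID); err != nil {
		if !errors.Is(err, ErrNotFound) {
			return fmt.Errorf("get session: %w", err)
		}
		if err := db.CreateSession(contextID); err != nil {
			return fmt.Errorf("create session: %w", err)
		}
	}
	if err := db.StoreTask(taskID, contextID); err != nil {
		return fmt.Errorf("store task: %w", err)
	}
	return nil
}

func main() {
	db := &memStore{sessions: map[string]bool{}, tasks: map[string]string{}}
	if err := recordTask(db, "task-1", "ctx-1"); err != nil {
		panic(err)
	}
	fmt.Println(db.tasks["task-1"]) // prints "ctx-1"
}
```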
```go
// Marshal remote agent's AgentCard to store in DB
var serializedCard string
if agent.Spec.Type == v1alpha2.AgentType_Remote && agentOutputs != nil {
```
How would a remote agent have a card stored from the translation? If we're going to store the agent card, we should do it in a separate controller from this one, since fetching the card involves this required async logic.
```go
	Description: toolAgent.Spec.Description,
})
case v1alpha2.AgentType_Remote:
	/* TODO: Add support for remote agents.
```
The translator is a pure function; if we are going to make a network call, we should do it in the reconciler.
Makes sense. I just need to figure out how the polling of the agent card ties into this a bit better. I was/am planning on handling polling logic in a follow-up to avoid blowing up this PR with diffs.
Self-thought/note:
I think we do something along these lines for MCP tools, where there's a watcher that updates agents when MCP tools update. Something like that would be done for remote agent-as-a-tool, where updates would occur when they get re-polled (or when the remote agent reconciles).
Additionally, storing a hash of the agent card in an annotation on the remote agent could kick off its reconciliation, which should then update its usages as a tool by the agents watching it.
Would probably be worth huddling (unless this is an easy yes/no). For using it as a tool we can:
- Add a watcher, so when remote agents reconcile, agents using them as a tool do so as well
- During reconciliation, remote agent urls/configs are updated based on the latest data (from the database)
This way it stays up to date.
When we implement polling, we'll need to see how to handle it (in my opinion, adding an annotation on remote agents that is a hash of the config should be good; this forces a reconciliation + caller agent updates). I'm trying this locally. It works in theory, but I'm getting a 503 adk error when the caller agent tries calling the remote agent 🤔
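The config-hash annotation idea above can be sketched as follows; the annotation key and the spec shape are hypothetical, and the real kagent CRD types would be used instead:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// RemoteSpec mirrors the shape discussed in this thread; the field set is
// an assumption for illustration.
type RemoteSpec struct {
	DiscoveryURL string `json:"discoveryUrl"`
}

// specHash returns a stable hash of the remote agent's config. Writing it
// to an annotation means a changed config (or re-fetched agent card)
// produces a different annotation value, which forces a reconciliation
// and, via a watch, updates to the agents using it as a tool.
func specHash(spec RemoteSpec) string {
	b, _ := json.Marshal(spec) // deterministic for a fixed struct shape
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	annotations := map[string]string{}
	spec := RemoteSpec{DiscoveryURL: "http://agents.example/.well-known/agent.json"}
	// Hypothetical annotation key, not an existing kagent convention.
	annotations["kagent.dev/remote-config-hash"] = specHash(spec)
	fmt.Println(len(annotations["kagent.dev/remote-config-hash"])) // prints 64
}
```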
Why do we need a whole separate agent for this? Can't we just treat it as remote in the tests?
I'm going to play around with this idea a bit more.
Initially I was against it because it would mean we need the BYO kebab Agent to be created first, so that its deployment + service exist and the Remote Agent can reference the service URL. But that should be simple to deal with.
Now, after trying it out: the remote agent test passes, but when I try to chat with the remote agent...
Response in chat:
```
Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]' For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
```
Kebab agent's deployment logs:
```
INFO: 10.244.0.51:43074 - "POST / HTTP/1.1" 200 OK
ERROR:google_adk.kagent.adk._agent_executor:Error handling A2A request: Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
Traceback (most recent call last):
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_agent_executor.py", line 125, in execute
    await self._handle_request(context, event_queue, runner)
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_agent_executor.py", line 206, in _handle_request
    async for adk_event in runner.run_async(**run_args):
    ...<4 lines>...
    await event_queue.enqueue_event(a2a_event)
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 250, in run_async
    async for event in agen:
      yield event
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 228, in _run_with_trace
    await self._append_new_message_to_session(
    ...<5 lines>...
    )
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 357, in _append_new_message_to_session
    await self.session_service.append_event(session=session, event=event)
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_session_service.py", line 168, in append_event
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/.kagent/python/.venv/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
INFO: 10.244.0.1:51740 - "GET /health HTTP/1.1" 200 OK
INFO: 10.244.0.1:46628 - "GET /health HTTP/1.1" 200 OK
```
I'll need to dig into this, but it's probably due to the remote agent's a2a handler storing information in the DB while the other agent does so as well (since byo + declarative agents also store in the db themselves). I'm not even sure this would be a "real world" issue, since a user likely wouldn't define an agent created by kagent as a remote agent. (unless there are valid cross-cluster use cases for this; in that case, it would be good to give the remote agent config a bool field to not store in the db)
Yeah, so while the test (somehow) passes, the session storage is a big issue if the remote agent writes to the database as well.
```
kagent >> get session
+---+------------------------------------------+--------------------+--------------------------------+----------------------+
| # | ID                                       | NAME               | AGENT                          | CREATED              |
+---+------------------------------------------+--------------------+--------------------------------+----------------------+
| 1 | ctx-5b33e4b2-d2d6-4854-955c-0f6dea1a282f | remote-kebab-agent | kagent__NS__remote_kebab_agent | 2025-09-17T22:54:21Z |
+---+------------------------------------------+--------------------+--------------------------------+----------------------+
```
The remote agent "owns" this session id, so when the agent it communicates with tries accessing/writing it, it fails, which makes sense.
So what happens is:
- The remote Agent handler creates a Session (as expected)
- It communicates with the other Agent
- The other agent was created by us, meaning it has DB access
- The other agent tries creating/getting the Session
- The other agent fails and returns an error, because the Session already exists and is owned by the Remote Agent
If we want to allow the case where a remote agent is one which already writes its sessions to the DB, then we'll want to add a spec field to disable the database writing -- or do something else where it passes a different context/id to the remote agent. The main issue here would be figuring out how to handle their own sessions/display.
```yaml
[...]
type: Remote
remote:
  [urls...]
  persistData: false # or persistTask/Session, persist, store, etc.
```
```go
case v1alpha2.AgentType_Remote:
	cfg.RemoteAgents = append(cfg.RemoteAgents, adk.RemoteAgentConfig{
		Name:        utils.ConvertToPythonIdentifier(utils.GetObjectRef(toolAgent)),
		Url:         agent.Spec.Remote.DiscoveryURL,
		Headers:     headers,
		Description: toolAgent.Spec.Description,
	})
```
Q: Since we're not using Watches, does this mean that if a remote agent's discovery URL is updated, any Agent referencing it as a tool won't get the change until it manually reconciles?
(iirc, the deployments for declarative agents have their agent-as-tool urls "baked in" to their secrets, which update on reconciliation.)
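A watch addressing the question above would boil down to a map function: given a remote agent that changed, find the agents referencing it as a tool so the controller can re-enqueue them. A minimal sketch with stand-in types (not the real CRD structs, and without the controller-runtime wiring):

```go
package main

import "fmt"

// Agent is a minimal stand-in for the real CRD type; fields are illustrative.
type Agent struct {
	Name  string
	Tools []string // names of agents referenced as tools
}

// referencingAgents is the core of a hypothetical watch map-function:
// given the name of a remote agent that changed, return the agents that
// use it as a tool, so each of them can be reconciled again.
func referencingAgents(all []Agent, remoteName string) []string {
	var out []string
	for _, a := range all {
		for _, t := range a.Tools {
			if t == remoteName {
				out = append(out, a.Name)
				break
			}
		}
	}
	return out
}

func main() {
	agents := []Agent{
		{Name: "planner", Tools: []string{"remote-kebab-agent"}},
		{Name: "solo", Tools: nil},
	}
	fmt.Println(referencingAgents(agents, "remote-kebab-agent")) // prints [planner]
}
```

In controller-runtime this function would be wrapped in a `handler.EnqueueRequestsFromMapFunc` attached via `Watches`, so updates to remote agents re-reconcile their callers.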
When will this be merged into kagent? This is a feature that I would find very valuable.
leftovers (passing to peter):
- look into why the watch + reconciliation isn't happening. It was working during commit aa2c5f1, which was right before making changes to simplify the CRD. Maybe I just need to re-spin up the cluster, or I forgot a change.
- OnSendMessageStream: this way we 100% have a finalized Task to store. A stream could end with a StatusUpdate final: true, without a Task object, meaning we'd have to have built it on our end during the stream.
Context
Adding support for a2a with agents hosted on a remote server.
My goal for this work is to encompass the simple bare necessities 🐻 of remote agent support in a usable form (although it requires manual reconciliation for agent card updates), then immediately work on follow-ups (polling).
Changes
TODO: Update description based on new simpler CRD (discoveryUrl)

- Added a Remote type for Agents. This remote agent type allows for a2a communications with agents served elsewhere.
- Used the error field to catch server errors on creation, but our error wrapper only set a message, so I ensured it also sets error.
- remoteConfig: similar to the existing config, but for remote agents. Reused the existing remote config to store the main remote agent information.
- agentCard: only used for remote agents. Used for UI/displaying purposes if a user wants to see the agent card details.

Options considered for storing the agent card:
- agentCard instead of remoteConfig in the db -- they hold similar information (name, description, url), the agent card just holds more.
- agent card data on the agent status instead of the db -- this is similar to something done in gloo portal, where it stores the api discovery information on the status of some(?) custom resource def.
- Fetching on agentCard preview, so not storing the fetched data -- good for saving storage, not ideal UX.
- remoteConfig which includes the agent card as []byte data. I think this makes the most sense for simplicity's sake.

TODO/Unknown
note: i need to implement a new a2a server with task handling; it looks like an update to https://github.com/kagent-dev/a2a-go is required, as this is where we handle our a2a server logic. unless we want this stateful a2a logic to live here, but this may be a bit weird to split. we wouldn't make updates to the upstream repo our a2a-go is forked from, since these changes are kagent-specific.
crossing this off for now while looking into it, to avoid a net-new service + cross-repo changes. maybe by creating a separate task manager for remote services this can be handled? else, we'll need to expand a2a-go to implement a net-new a2a server.

Follow-Ups
This is a list of work I believe would work best as a follow-up, in order to keep this PR at a reasonable size (diffs). I'm planning on implementing these after this gets merged, or is close to merge.
Updates
resolves: #820