(Feat) Implement `remote` Agent type #821

inFocus7 · 2025-08-27T17:18:14Z

leftovers (passing to peter):

~~look into why the watch + reconciliation isn't happening aa2c5f1~~
- ~~it was working during this commit, which was right before making changes to simplify the CRD. maybe just need to re-spin up cluster, or forgot a change.~~
- found out why I believed it wasn't working: an invalid URL causes a reconciliation error on the remote agent, which means we don't kick off reconciliation for its callers. This is expected.
the UI remote agent creation reports that it's not a valid type. This started occurring after the CRD simplification, so I may have missed something there.
Eitan's comment about managing a Task manually in our OnSendMessageStream, so that we always have a finalized Task to store. A stream could end with a StatusUpdate final:true without ever sending a Task object, meaning we'd have to build the Task on our end during the stream.
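To make the last leftover concrete, here is a minimal sketch of building a Task while the stream is being consumed. The `Task`, `StatusUpdate`, `ArtifactUpdate`, and `TaskBuilder` types are hypothetical, simplified stand-ins and not the actual a2a-go types; the point is only that a final status update can close out a Task we assembled ourselves.

```go
package main

// Hypothetical, simplified stand-ins for A2A stream events. The real
// a2a-go types carry more fields and may be named differently.
type Task struct {
	ID        string
	ContextID string
	State     string
	Artifacts []string
}

type StatusUpdate struct {
	TaskID    string
	ContextID string
	State     string
	Final     bool
}

type ArtifactUpdate struct {
	TaskID   string
	Artifact string
}

// TaskBuilder accumulates stream events into a Task, so a finalized Task
// exists even if the remote agent never sends a full Task object.
type TaskBuilder struct {
	task  Task
	final bool
}

// Apply folds one stream event into the task being built.
func (b *TaskBuilder) Apply(event any) {
	switch e := event.(type) {
	case Task:
		// A full Task from the remote agent replaces what we built so far.
		b.task = e
	case StatusUpdate:
		b.task.ID = e.TaskID
		b.task.ContextID = e.ContextID
		b.task.State = e.State
		b.final = b.final || e.Final
	case ArtifactUpdate:
		b.task.ID = e.TaskID
		b.task.Artifacts = append(b.task.Artifacts, e.Artifact)
	}
}

// Result returns the accumulated task and whether a final status was seen.
func (b *TaskBuilder) Result() (Task, bool) {
	return b.task, b.final
}
```

The stream handler would call `Apply` for every event and only store the result once `Result` reports a final status.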

Context

Adding support for a2a with agents hosted on a remote server.

My goal is for this work to encompass the simple bare necessities 🐻 of remote agent support, which is usable (although it requires manual reconciliation for agent card updates). Then immediately work on follow-ups (polling).

Changes

TODO: Update description based on new simpler CRD (discoveryUrl)

Adds a Remote type for Agents. This remote agent type allows for a2a communications with agents served elsewhere.
- There are two fields, the agent card url (required) and the server url (optional, to override the agent card's url for a2a).
- Status is only based on reconciliation.
Adds remote agent creation + editing functionality in UI.
- When working on this, i noticed we check for .error field to catch server errors on creation, but our error wrapper only set a message, so I ensured it also set error.
Storing new details in the database
- remoteConfig: similar to existing config, but for remote agents. reused existing remote config to store the main remote agent information.
- agentCard: only using for remote agents. used for UI/displaying purposes if a user wants to see the agent card details.
  - I stored this to allow users to preview the agent card. it makes most sense for them to view it based on the latest fetch stored versus dynamically fetching with agent url whenever they want to preview. what they see (agent card) should be what they have (latest fetched state).
  - Open to other ideas (i'm listing a few alternatives below)
    1. storing only agentCard instead of remoteConfig in the db -- they hold similar information (name, description, url), the agent card just holds more.
    2. storing agent card data on the agent status instead of the db -- this is similar to something done in gloo portal where it stores the api discovery information on the status of some(?) custom resource def.
    3. not allowing agentCard preview, so not storing the fetched data -- good for saving storage, not ideal UX.
    4. New custom remoteConfig which includes agent card as []byte data. I think this makes most sense for simplicity's sake.
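As a rough illustration of alternative 4, the sketch below keeps the fetched agent card as raw bytes inside a single remote config record. `RemoteConfig`, `AgentCard`, their field names, and `NewRemoteConfig` are assumptions made for this example and do not reflect the actual database model in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AgentCard is a hypothetical subset of an A2A agent card, used only to
// validate and preview the stored bytes.
type AgentCard struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	URL         string `json:"url"`
}

// RemoteConfig is a hypothetical DB payload for a remote agent that embeds
// the last fetched agent card verbatim.
type RemoteConfig struct {
	AgentCardURL string          `json:"agentCardUrl"`
	AgentCard    json.RawMessage `json:"agentCard,omitempty"`
}

// NewRemoteConfig validates that rawCard parses as an agent card and stores
// the bytes untouched, so unknown card fields are preserved.
func NewRemoteConfig(agentCardURL string, rawCard []byte) (RemoteConfig, error) {
	var card AgentCard
	if err := json.Unmarshal(rawCard, &card); err != nil {
		return RemoteConfig{}, fmt.Errorf("invalid agent card: %w", err)
	}
	return RemoteConfig{AgentCardURL: agentCardURL, AgentCard: json.RawMessage(rawCard)}, nil
}

// Card decodes the stored bytes for preview purposes.
func (c RemoteConfig) Card() (AgentCard, error) {
	var card AgentCard
	if err := json.Unmarshal(c.AgentCard, &card); err != nil {
		return AgentCard{}, fmt.Errorf("decode stored agent card: %w", err)
	}
	return card, nil
}
```

Storing the raw bytes means the preview always reflects the latest fetched state, at the cost of keeping the full card in the database.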

TODO/Unknown

session/task storage
- Eitan brought up a different approach for remote agent storage. I'll follow up on that during the review cycle, or at least after resolving most of these todos.
- Currently, I'm hoping(?) that remote agents are implemented with their own task storage/setup, and we'd simply store the tasks (with remote agent-provided information) in the database for future fetching.
- We created a new manager acting as a middleware for remote agent communication. This would handle storing tasks/sessions to our storage.
- note: i need to implement a new a2a server with task handling, it looks like an update to https://github.com/kagent-dev/a2a-go is required - as this is where we handle our a2a server logic. unless we want this stateful_a2a logic to live here, but this may be a bit weird to split. we wouldn't make updates to the upstream repo our a2a-go is forked from since these changes are kagent-specific. crossing off for now while looking into this to avoid a net-new service + cross-repo changes. maybe by creating a separate task manager for remote services, this can be handled? else, we'll need to expand on the a2a-go to implement a net-new a2a server.

Follow-Ups

This is a list of work that I believe fits best as follow-ups, to keep this PR at a reasonable size (diffs). I'm planning on implementing these after this PR is merged, or is close to merge.

agent card re-fetch poll
- Question: Should polling be configured on a per-remote agent basis? Or a global polling that batch polls/updates all remote agents?
allow for remote agent choosing in tools & agents selection.
- currently unsupported. this would be implemented after the polling implementation to better understand how the information fetching + updating would work through agent-as-a-subagent.
- I tried a setup locally which half worked, which did the following:
  1. updated the manifests (secrets + deployment hash) for declarative agents using remote agents as tools, so they query the db for information.
  2. added a watcher so when remote agents reconcile, we also do the same for agents calling them as tools (to ensure they get updated configs)
- An issue with the above is how we set up agents as tools in our kagent/adk code. When we create the agent as a tool, we assume that the server URL is hosted at the same path as the agent card. This is a bad assumption for remote agents, as the two can differ. It then uses the server URL configured on the agent card. This is also a bad assumption, as it would use the original server's agent card, which would not hold the server override (if set). We would want to expose our own agent card handler that displays the agent card/config stored in the db (which holds the overridden URL).
auth for secured remote agents(?)
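The server URL assumptions described in the agent-as-tool item above could be avoided by resolving the URL in one place. This is a minimal sketch, assuming a helper named `ResolveServerURL` (hypothetical) that prefers the configured override and falls back to the URL from the agent card.

```go
package main

import (
	"fmt"
	"net/url"
)

// ResolveServerURL picks the A2A server URL for a remote agent: the
// override when set, otherwise the URL advertised on the agent card.
// It rejects values that are not absolute URLs.
func ResolveServerURL(cardURL, overrideURL string) (string, error) {
	chosen := cardURL
	if overrideURL != "" {
		chosen = overrideURL
	}
	u, err := url.Parse(chosen)
	if err != nil {
		return "", fmt.Errorf("parse server url %q: %w", chosen, err)
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("server url %q must be absolute", chosen)
	}
	return chosen, nil
}
```

An agent card handler served from our side could then return the stored card with its URL replaced by the resolved value, so callers never see the original server's URL when an override is set.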

Updates

i'll ask if we can implement polling in a fast-follow (well, as fast as i can implement). There's a decent amount of additions in this PR for its initial support. It would be easier to get this reviewed and merged, then implement polling as a separate PR imo.
- if i do this polling in a separate PR, i'll also look into the remote-agent-as-called-agent implementation since polling would affect it.

resolves: #820


inFocus7 · 2025-09-10T18:36:46Z

holy moly, there's a lot of merge conflicts. ~~not sure when/if picking this up again, but note to self: it may be easier to begin a new branch and manually pick over changes.~~

update: resolved merge conflicts locally. unsure if it's all still working. need to test it again + do a self code review to make sure I didn't remove any work from main during merging.

need to find my reproduction setup again :rip:


Copilot

Pull Request Overview

This PR implements support for remote agents in the kagent system, allowing for A2A (Agent-to-Agent) communication with agents hosted on external servers. The implementation includes the creation of a new "Remote" agent type that uses agent cards to discover and communicate with external agents.

Key changes include:

Added "Remote" agent type with agent card URL and optional server URL configuration
Implemented UI components for creating, editing, and previewing remote agents
Created middleware for recording remote agent interactions in the database
Extended the A2A protocol handling to support remote agent task management

Reviewed Changes

Copilot reviewed 35 out of 37 changed files in this pull request and generated 5 comments.


| File | Description |
| --- | --- |
| ui/src/types/index.ts | Added Remote agent type and RemoteAgentSpec interface |
| ui/src/lib/messageHandlers.ts | Enhanced token usage extraction for remote agents and improved artifact streaming |
| ui/src/components/sidebars/AgentDetailsSidebar.tsx | Updated to handle remote agents without model display |
| ui/src/components/create/SelectToolsDialog.tsx | Added null safety for search terms and descriptions |
| ui/src/components/AgentsProvider.tsx | Added remote agent validation and form data types |
| ui/src/components/AgentCardPreview.tsx | New component for previewing remote agent cards |
| ui/src/components/AgentCard.tsx | Added preview functionality for remote agents |
| ui/src/app/agents/new/page.tsx | Extended agent creation form with remote agent fields |
| ui/src/app/actions/utils.ts | Fixed error response to include error field |
| ui/src/app/actions/agents.ts | Added getAgentCard API call and remote agent form handling |
| go/api/v1alpha2/agent_types.go | Added Remote agent type and RemoteAgentSpec to CRD |
| go/internal/controller/translator/adk_api_translator.go | Added remote agent card fetching and translation |
| go/internal/controller/reconciler/reconciler.go | Enhanced reconciler to handle remote agents and store agent cards |
| go/internal/a2a/recording_manager.go | New recording manager for remote agent task/session storage |
| go/internal/a2a/a2a_handler_mux.go | Updated A2A handler to use recording manager for remote agents |
| go/internal/httpserver/handlers/agents.go | Added agent card retrieval endpoint |
| go/internal/database/models.go | Extended Agent model with remote config and agent card storage |


ui/src/lib/messageHandlers.ts

go/internal/httpserver/handlers/agents.go

go/internal/controller/translator/adk_api_translator.go

Makefile

go/internal/a2a/recording_manager.go


EItanya

The overall design makes sense, but there are some details that need to be worked out before we can merge this.

go/internal/httpserver/server.go

go/internal/a2a/recording_manager.go

EItanya · 2025-09-17T18:04:36Z

go/internal/a2a/recording_manager.go

+			if err := m.dbClient.StoreTask(task); err != nil {
+				logger.Error(err, "Failed to store sync task", "taskID", task.ID, "contextID", task.ContextID)
+			}


Why are we failing silently here? The whole point of this functionality is to record this

Done, I did a more explicit get -> if not exist, create for the Session.

If a session does not exist and cannot be created, we do not store a task. This assumes a session -> tasks relation. If we allow Tasks to exist without a Session, I could change this to always create a Task.
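A minimal sketch of the get -> create-if-missing -> store flow described in this reply, with errors returned instead of only logged. The `Store` interface, `ErrNotFound`, and the in-memory `memStore` are hypothetical and exist only to keep the example self-contained; they are not the actual kagent database client.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel returned when a session is missing.
var ErrNotFound = errors.New("not found")

type Session struct{ ID string }

type Task struct{ ID, ContextID string }

// Store is a hypothetical slice of the database client used for recording.
type Store interface {
	GetSession(id string) (*Session, error)
	CreateSession(s *Session) error
	StoreTask(t *Task) error
}

// RecordTask ensures the session exists, then stores the task. Any failure
// is returned to the caller rather than being swallowed.
func RecordTask(store Store, task *Task) error {
	_, err := store.GetSession(task.ContextID)
	if errors.Is(err, ErrNotFound) {
		err = store.CreateSession(&Session{ID: task.ContextID})
	}
	if err != nil {
		return fmt.Errorf("ensure session %q: %w", task.ContextID, err)
	}
	if err := store.StoreTask(task); err != nil {
		return fmt.Errorf("store task %q: %w", task.ID, err)
	}
	return nil
}

// memStore is an in-memory Store used only for demonstration.
type memStore struct {
	sessions map[string]*Session
	tasks    map[string]*Task
}

func newMemStore() *memStore {
	return &memStore{sessions: map[string]*Session{}, tasks: map[string]*Task{}}
}

func (m *memStore) GetSession(id string) (*Session, error) {
	s, ok := m.sessions[id]
	if !ok {
		return nil, ErrNotFound
	}
	return s, nil
}

func (m *memStore) CreateSession(s *Session) error {
	m.sessions[s.ID] = s
	return nil
}

func (m *memStore) StoreTask(t *Task) error {
	m.tasks[t.ID] = t
	return nil
}
```

With this shape, the session -> tasks relation is enforced in one function, and the caller decides whether a recording failure should fail the request.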

go/internal/a2a/recording_manager.go

EItanya · 2025-09-17T18:10:05Z

go/internal/controller/reconciler/reconciler.go

+
+	// Marshal remote agent's AgentCard to store in DB
+	var serializedCard string
+	if agent.Spec.Type == v1alpha2.AgentType_Remote && agentOutputs != nil {


How would a remote agent have a card stored from the translation? If we're going to store the agent card we should do it in a separate controller from this one since there's this required async logic of the card.

EItanya · 2025-09-17T18:13:10Z

go/internal/controller/translator/adk_api_translator.go

 					Description: toolAgent.Spec.Description,
 				})
+			case v1alpha2.AgentType_Remote:
+				/* TODO: Add support for remote agents.


The translator is a pure function, if we are going to make a network call we should do it in the reconciler

Makes sense. Just need to figure out how the polling of agent card ties into this a bit better. I was/am planning on handling polling logic in a follow-up to avoid blowing up this PR with diffs.
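One way to read the feedback above is to split the work in two: a fetch step that performs the network call (reconciler side) and a translation step that stays pure. The sketch below assumes hypothetical names (`FetchAgentCard`, `Translate`, `RemoteAgentConfig`) and a reduced `AgentCard`; it is not the code in this PR.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// AgentCard is a hypothetical subset of an A2A agent card.
type AgentCard struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	URL         string `json:"url"`
}

// FetchAgentCard performs the network call. In this split it would be
// invoked from the reconciler, never from the translator.
func FetchAgentCard(ctx context.Context, client *http.Client, cardURL string) (AgentCard, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, cardURL, nil)
	if err != nil {
		return AgentCard{}, fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return AgentCard{}, fmt.Errorf("fetch agent card: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return AgentCard{}, fmt.Errorf("fetch agent card: unexpected status %d", resp.StatusCode)
	}
	// Cap the body size so a misbehaving server cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return AgentCard{}, fmt.Errorf("read agent card: %w", err)
	}
	var card AgentCard
	if err := json.Unmarshal(body, &card); err != nil {
		return AgentCard{}, fmt.Errorf("decode agent card: %w", err)
	}
	return card, nil
}

// RemoteAgentConfig is a hypothetical translator output.
type RemoteAgentConfig struct {
	Name        string
	URL         string
	Description string
}

// Translate is pure: it only maps already-fetched data to config.
func Translate(name string, card AgentCard) RemoteAgentConfig {
	return RemoteAgentConfig{Name: name, URL: card.URL, Description: card.Description}
}
```

The reconciler would call `FetchAgentCard`, persist or hash the result, and pass it to `Translate`, which keeps the translator testable without a network.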

Self-thought/note:
I think we do something along these lines for MCP tools, where there's a watcher that updates agents when mcp tools update. So something like that would be done for remote agent-as-a-tool, where updates would occur when they get re-polled (or when the remote agent reconciles).

Additionally, storing a hash of the agent card in an annotation on the remote agent could kick off its reconciliation, which should then update the agents that use it as a tool and are watching it.

Would probably be worth huddling (unless this is an easy yes/no). For using it as a tool we can:

Add a watcher, so when remote agents reconcile, agents using them as a tool do so as well

During reconciliation, remote agent urls/configs are updated based on latest data (from the database)
This way it stays up to date.

When we implement polling, we'll need to see how to handle it (in my opinion adding an annotation on remote agents that is a hash of the config should be good. this way it forces a reconciliation + caller agents update). I'm trying this locally. It works in theory, but getting a 503 adk error when the caller agent tries calling the remote agent 🤔
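The annotation idea in this thread can be sketched as follows. The annotation key `kagent.dev/agent-card-hash` and the helper names are assumptions for illustration. Note that hashing the raw bytes is sensitive to formatting and key order, so a real implementation might want to normalize the card first.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// cardHashAnnotation is a hypothetical annotation key.
const cardHashAnnotation = "kagent.dev/agent-card-hash"

// CardHash returns the hex-encoded SHA-256 of the raw agent card bytes.
func CardHash(rawCard []byte) string {
	sum := sha256.Sum256(rawCard)
	return hex.EncodeToString(sum[:])
}

// SyncCardHash writes the card hash into the annotations and reports
// whether the value changed. A change is what would trigger a new
// reconciliation of the remote agent and, through a watch, of its callers.
func SyncCardHash(annotations map[string]string, rawCard []byte) (map[string]string, bool) {
	if annotations == nil {
		annotations = map[string]string{}
	}
	hash := CardHash(rawCard)
	if annotations[cardHashAnnotation] == hash {
		return annotations, false
	}
	annotations[cardHashAnnotation] = hash
	return annotations, true
}
```

A poller would only need to patch the resource when `SyncCardHash` reports a change, which avoids reconciling on every poll.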

EItanya · 2025-09-17T18:13:55Z

go/test/e2e/agents/remote-kebab/app.py

Why do we need a whole separate agent for this? Can't we just treat it as remote in the tests?

I'm going to play around with this idea a bit more.

Initially I was against it because it would mean we need the BYO kebab Agent to be created first, so that its deployment + service exists and the Remote Agent can reference the service URL. But should be simple to deal with.

Now, after trying it out: the remote agent test passes, but when I try to chat with the remote agent...

Response in chat:

Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]' For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403

Kebab agent's deployment:

INFO: 10.244.0.51:43074 - "POST / HTTP/1.1" 200 OK
ERROR:google_adk.kagent.adk._agent_executor:Error handling A2A request: Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
Traceback (most recent call last):
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_agent_executor.py", line 125, in execute
    await self._handle_request(context, event_queue, runner)
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_agent_executor.py", line 206, in _handle_request
    async for adk_event in runner.run_async(**run_args):
    ...<4 lines>...
        await event_queue.enqueue_event(a2a_event)
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 250, in run_async
    async for event in agen:
        yield event
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 228, in _run_with_trace
    await self._append_new_message_to_session(
    ...<5 lines>...
    )
  File "/.kagent/python/.venv/lib/python3.13/site-packages/google/adk/runners.py", line 357, in _append_new_message_to_session
    await self.session_service.append_event(session=session, event=event)
  File "/.kagent/python/packages/kagent-adk/src/kagent/adk/_session_service.py", line 168, in append_event
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/.kagent/python/.venv/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'http://kagent-controller.kagent:8083/api/sessions/ctx-a6f000bf-7e85-4465-ab06-f08dd9be04d7/[email protected]'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
INFO: 10.244.0.1:51740 - "GET /health HTTP/1.1" 200 OK
INFO: 10.244.0.1:46628 - "GET /health HTTP/1.1" 200 OK

I'll need to dig into this, but it's probable that it's due to the remote agent a2a storing information in the DB and the other agent doing so as well (since byo + declarative agents also store in the db themselves). I'm not even sure if this would be a "real world" issue, since a user likely wouldn't be defining an agent created by kagent as a remote agent. (unless there's valid cross-cluster use cases for this, but in this case, it would be good to config remote agent with a bool field to not store in the db)

Yeah, so while the test (somehow) passes, the session storage is a big issue if the remote agent writes to the database as well.

kagent >> get session
+---+------------------------------------------+--------------------+--------------------------------+----------------------+
| # | ID                                       | NAME               | AGENT                          | CREATED              |
+---+------------------------------------------+--------------------+--------------------------------+----------------------+
| 1 | ctx-5b33e4b2-d2d6-4854-955c-0f6dea1a282f | remote-kebab-agent | kagent__NS__remote_kebab_agent | 2025-09-17T22:54:21Z |
+---+------------------------------------------+--------------------+--------------------------------+----------------------+

remote agent "owns" this session id, so when the agent it communicates to tries accessing/writing it, it fails. which makes sense.

So what happens is:

1. The remote Agent handler creates a Session (as expected).
2. It communicates with the other Agent.
3. The other agent was created by us, meaning it has DB access.
4. The other agent tries creating/getting the Session.
5. The other agent fails and returns an error, because the Session already exists and is owned by the Remote Agent.

If we want to allow the case where a remote agent is one which already writes its sessions to the DB, then we'll want to add a spec field to disable the database writing -- or do something else where it passes a different context/id to the remote agent. The main issue here would be figuring out how to handle their own sessions/display.

[...]
type: Remote
remote:
  [urls...]
  persistData: false # or persistTask/Session, persist, store, etc..
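On the Go side, the persistData idea above could look roughly like this. `RemoteSpec`, `PersistData`, and `RecordIfEnabled` are hypothetical names; the field is a pointer so that an unset value keeps today's behavior of persisting by default.

```go
package main

// RemoteSpec is a hypothetical version of the remote agent spec with an
// opt-out switch for database recording.
type RemoteSpec struct {
	DiscoveryURL string
	// PersistData is nil when unset, which is treated as true.
	PersistData *bool
}

// ShouldPersist reports whether tasks/sessions for this remote agent
// should be written to the database.
func (s RemoteSpec) ShouldPersist() bool {
	return s.PersistData == nil || *s.PersistData
}

// RecordIfEnabled runs the given store function only when persistence is
// enabled, so a remote agent that already writes its own sessions is not
// shadowed by a session owned by the recording manager.
func RecordIfEnabled(spec RemoteSpec, store func() error) error {
	if !spec.ShouldPersist() {
		return nil
	}
	return store()
}
```

This only covers the write path. How sessions owned by the other agent would be displayed is still the open question from the comment above.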


* main: Fix UI/streaming timeouts for long running LLM requests (kagent-dev#907) fix helm value for env (kagent-dev#910) feat: allow per-agent header configuration for tools (kagent-dev#884) feat: Set system message from ConfigMap or Secrets (kagent-dev#894) Signed-off-by: Peter Jausovec <[email protected]>


inFocus7 · 2025-09-22T20:43:12Z

go/internal/controller/translator/adk_api_translator.go

+			case v1alpha2.AgentType_Remote:
+				cfg.RemoteAgents = append(cfg.RemoteAgents, adk.RemoteAgentConfig{
+					Name:        utils.ConvertToPythonIdentifier(utils.GetObjectRef(toolAgent)),
+					Url:         agent.Spec.Remote.DiscoveryURL,
+					Headers:     headers,
+					Description: toolAgent.Spec.Description,
+				})


Q: Since we're not using Watches, does this mean that if a remote agent's discovery URL is updated, any Agent referencing it as a tool won't get it until they manually reconcile?

(iirc, the deployments for declarative agents have their agent-as-tool URLs "baked in" to their secrets, which update on reconciliation.)
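If a watch were added to address this, its core would be a lookup from a remote agent to the agents that reference it as a tool. Below is a minimal sketch of that lookup with hypothetical types (`AgentRef`, `Agent`, `DependentsOf`). In a controller-runtime setup, a function like this would typically back the map function passed to `handler.EnqueueRequestsFromMapFunc`, which is an assumption about wiring and not something this PR does.

```go
package main

// AgentRef identifies an agent by namespace and name.
type AgentRef struct {
	Namespace string
	Name      string
}

// Agent is a hypothetical, reduced view of an Agent resource: its own
// reference plus the agents it uses as tools.
type Agent struct {
	Ref        AgentRef
	ToolAgents []AgentRef
}

// DependentsOf returns the agents that reference target as a tool, in
// input order. These are the agents that need to be re-reconciled when
// the target's discovery URL or agent card changes.
func DependentsOf(agents []Agent, target AgentRef) []AgentRef {
	var dependents []AgentRef
	for _, agent := range agents {
		for _, tool := range agent.ToolAgents {
			if tool == target {
				dependents = append(dependents, agent.Ref)
				break // list each dependent once, even if referenced twice
			}
		}
	}
	return dependents
}
```

Without something like this, the answer to the question above is yes: callers keep the old URL until they are reconciled for another reason.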

ae12345678910 · 2025-11-07T15:20:10Z

When will this be merged into kagent? This is a feature that I would find very valuable

inFocus7 added 5 commits August 28, 2025 10:48

initial remote agent crd translation setup

9404e22

Signed-off-by: Fabian Gonzalez <[email protected]>

add remote agent to agent creation ui

a9dd0a3

Signed-off-by: Fabian Gonzalez <[email protected]>

update message handler to display finalized messages that are not inc…

e201039

…luded in the final status update Signed-off-by: Fabian Gonzalez <[email protected]>

create new recording task manager for remote agent a2a chat storage

d9a9286

Signed-off-by: Fabian Gonzalez <[email protected]>

reverse requirements of agent card v. agent server

74cf718

Signed-off-by: Fabian Gonzalez <[email protected]>

inFocus7 force-pushed the feat/remote-agent-type branch from 168d68c to 74cf718 Compare August 28, 2025 14:48

inFocus7 added 3 commits August 28, 2025 12:53

return error on createErrorResponse so errors don't get ignored

cf3b890

Signed-off-by: Fabian Gonzalez <[email protected]>

add error on remote agent-as-tool request

c5b74ce

Signed-off-by: Fabian Gonzalez <[email protected]>

store remote agent card in database to preview agent information

79d1cc9

Signed-off-by: Fabian Gonzalez <[email protected]>

inFocus7 added 7 commits September 10, 2025 15:40

Merge branch 'main' into feat/remote-agent-type

3529d3c

minor code fixes - post merge

5422baa

Signed-off-by: Fabian Gonzalez <[email protected]>

Merge branch 'main' into feat/remote-agent-type

63c0f56

store metadata for remote agents (e.g. token usage) and support stric…

699cbd7

…t variety of token usage storage Signed-off-by: Fabian Gonzalez <[email protected]>

add (failing) remote agent testing

6205b9e

Signed-off-by: Fabian Gonzalez <[email protected]>

read x-user-id in e2e remote agent setup

71305ca

Signed-off-by: Fabian Gonzalez <[email protected]>

fix up remote agent e2e test + simplify recording manager to simply s…

f3f6dad

…tore task information from remote agents, instead of handling its own Signed-off-by: Fabian Gonzalez <[email protected]>

inFocus7 marked this pull request as ready for review September 16, 2025 16:14

inFocus7 requested a review from EItanya as a code owner September 16, 2025 16:14

Copilot AI review requested due to automatic review settings September 16, 2025 16:14

inFocus7 requested review from ilackarms, peterj and yuval-k as code owners September 16, 2025 16:14

Copilot AI reviewed Sep 16, 2025


inFocus7 added 6 commits September 16, 2025 14:10

remove comment + console log

4b2a73c

Signed-off-by: Fabian Gonzalez <[email protected]>

update Makefile to push remote Agent for e2e test

ecaff46

Signed-off-by: Fabian Gonzalez <[email protected]>

add todo

b5a47a9

Signed-off-by: Fabian Gonzalez <[email protected]>

Merge branch 'main' into feat/remote-agent-type

1ad4330

remove unecessary messageHandler additions

5b38f8c

Signed-off-by: Fabian Gonzalez <[email protected]>

update e2e remote test + clean up messageHandlers.ts

e3598a9

Signed-off-by: Fabian Gonzalez <[email protected]>

inFocus7 commented Sep 17, 2025

ui/src/lib/messageHandlers.ts

add message case with basic log

9a645ad

Signed-off-by: Fabian Gonzalez <[email protected]>

EItanya reviewed Sep 17, 2025


inFocus7 and others added 7 commits September 17, 2025 18:14

pr feedback: get pre-existing session before creating + remove dbClie…

9578013

…nt check Signed-off-by: Fabian Gonzalez <[email protected]>

match session check logic between OnSendMessage and OnSendMessageStream

ffb208f

Signed-off-by: Fabian Gonzalez <[email protected]>

move network request for remote agent from translator to reconciler

6674a25

Signed-off-by: Fabian Gonzalez <[email protected]>

support remote-agent-as-tool

aa2c5f1

Signed-off-by: Fabian Gonzalez <[email protected]>

simplify remote definition to remove url overwrite - resolves a2a iss…

808db25

…ues as tool Signed-off-by: Fabian Gonzalez <[email protected]>

wip

594c639

Signed-off-by: Peter Jausovec <[email protected]>

peterj force-pushed the feat/remote-agent-type branch from 51d1d2e to 15801b4 Compare September 18, 2025 23:39

peterj added 2 commits September 18, 2025 18:41

wip

3b0c818

Signed-off-by: Peter Jausovec <[email protected]>

fix issue with creating remote agents through the ui

ccffd74

Signed-off-by: Peter Jausovec <[email protected]>

peterj mentioned this pull request Sep 19, 2025

usage for remote agent is not showing up #914

Open

peterj and others added 2 commits September 19, 2025 13:58

set the agent name, ui fixes

43f6c7b

Signed-off-by: Peter Jausovec <[email protected]>

fix: remove external remote translation

f5e1a48

Signed-off-by: Eitan Yarmush <[email protected]>

inFocus7 commented Sep 22, 2025
