Add WG AI Integration #8519

Merged 1 commit on Aug 5, 2025

4 changes: 4 additions & 0 deletions OWNERS_ALIASES
@@ -130,6 +130,10 @@ aliases:
    - mfahlandt
    - ritazh
    - terrytangyuan
  wg-ai-integration-leads:
    - ardaguclu
    - rushmash91
    - zvonkok
  wg-batch-leads:
    - kannon92
    - mwielgus
1 change: 1 addition & 0 deletions liaisons.md
@@ -55,6 +55,7 @@ members will assume one of the departing members groups.
| [SIG UI](sig-ui/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
| [SIG Windows](sig-windows/README.md) | Benjamin Elder (**[@BenTheElder](https://github.com/BenTheElder)**) |
| [WG AI Conformance](wg-ai-conformance/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG AI Integration](wg-ai-integration/README.md) | Paco Xu 徐俊杰 (**[@pacoxu](https://github.com/pacoxu)**) |
| [WG Batch](wg-batch/README.md) | Antonio Ojea (**[@aojea](https://github.com/aojea)**) |
| [WG Data Protection](wg-data-protection/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG Device Management](wg-device-management/README.md) | Benjamin Elder (**[@BenTheElder](https://github.com/BenTheElder)**) |
1 change: 1 addition & 0 deletions sig-api-machinery/README.md
@@ -54,6 +54,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-api-machinery:
* [WG AI Integration](/wg-ai-integration)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-apps/README.md
@@ -57,6 +57,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-apps:
* [WG AI Integration](/wg-ai-integration)
* [WG Batch](/wg-batch)
* [WG Data Protection](/wg-data-protection)
* [WG Node Lifecycle](/wg-node-lifecycle)
1 change: 1 addition & 0 deletions sig-architecture/README.md
@@ -57,6 +57,7 @@ The Chairs of the SIG run operations and processes governing the SIG.

The following [working groups][working-group-definition] are sponsored by sig-architecture:
* [WG AI Conformance](/wg-ai-conformance)
* [WG AI Integration](/wg-ai-integration)
* [WG Device Management](/wg-device-management)
* [WG LTS](/wg-lts)
* [WG Serving](/wg-serving)
6 changes: 6 additions & 0 deletions sig-auth/README.md
@@ -62,6 +62,12 @@ subprojects, and resolve cross-subproject technical issues and decisions.
- [@kubernetes/sig-auth-test-failures](https://github.com/orgs/kubernetes/teams/sig-auth-test-failures) - Test Failures and Triage
- Steering Committee Liaison: Patrick Ohly (**[@pohly](https://github.com/pohly)**)

## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-auth:
* [WG AI Integration](/wg-ai-integration)


## Subprojects

The following [subprojects][subproject-definition] are owned by sig-auth:
1 change: 1 addition & 0 deletions sig-cli/README.md
@@ -63,6 +63,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-cli:
* [WG AI Integration](/wg-ai-integration)
* [WG Node Lifecycle](/wg-node-lifecycle)


1 change: 1 addition & 0 deletions sig-list.md
@@ -62,6 +62,7 @@ When the need arises, a [new SIG can be created](sig-wg-lifecycle.md)
| Name | Label | Stakeholder SIGs |Organizers | Contact | Meetings |
|------|-------|------------------|-----------|---------|----------|
|[AI Conformance](wg-ai-conformance/README.md)|[ai-conformance](https://github.com/kubernetes/kubernetes/labels/wg%2Fai-conformance)|* Architecture<br>* Testing<br>|* [Janet Kuo](https://github.com/janetkuo), Google<br>* [Mario Fahlandt](https://github.com/mfahlandt), Kubermatic GmbH<br>* [Rita Zhang](https://github.com/ritazh), Microsoft<br>* [Yuan Tang](https://github.com/terrytangyuan), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-ai-conformance)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-ai-conformance)|* Regular WG Meeting: [Thursdays at 10:00 PT (Pacific Time) (weekly)]()<br>
|[AI Integration](wg-ai-integration/README.md)|[ai-integration](https://github.com/kubernetes/kubernetes/labels/wg%2Fai-integration)|* API Machinery<br>* Apps<br>* Architecture<br>* Auth<br>* CLI<br>|* [Arda Guclu](https://github.com/ardaguclu), Red Hat<br>* [Arush Sharma](https://github.com/rushmash91), Amazon<br>* [Zvonko Kaiser](https://github.com/zvonkok), NVIDIA<br>|* [Slack](https://kubernetes.slack.com/messages/wg-ai-integration)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-ai-integration)|* WG AI Integration Weekly Meeting: [Wednesdays at 9:30 PT (Pacific Time) (weekly)]()<br>
|[Batch](wg-batch/README.md)|[batch](https://github.com/kubernetes/kubernetes/labels/wg%2Fbatch)|* Apps<br>* Autoscaling<br>* Node<br>* Scheduling<br>|* [Kevin Hannon](https://github.com/kannon92), Red Hat<br>* [Marcin Wielgus](https://github.com/mwielgus), Google<br>* [Maciej Szulik](https://github.com/soltysh), Defense Unicorns<br>* [Swati Sehgal](https://github.com/swatisehgal), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-batch)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-batch)|* Regular Meeting ([calendar](https://calendar.google.com/calendar/embed?src=8ulop9k0jfpuo0t7kp8d9ubtj4%40group.calendar.google.com)): [Thursdays (starting February 15th 2024)s at 3PM CET (Central European Time) (monthly)](https://zoom.us/j/98329676612?pwd=c0N2bVV1aTh2VzltckdXSitaZXBKQT09)<br>
|[Data Protection](wg-data-protection/README.md)|[data-protection](https://github.com/kubernetes/kubernetes/labels/wg%2Fdata-protection)|* Apps<br>* Storage<br>|* [Xing Yang](https://github.com/xing-yang), VMware<br>* [Xiangqian Yu](https://github.com/yuxiangqian), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-data-protection)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-data-protection)|* Regular WG Meeting: [Wednesdays at 9:00 PT (Pacific Time) (bi-weekly)](https://zoom.us/j/6933410772)<br>
|[Device Management](wg-device-management/README.md)|[device-management](https://github.com/kubernetes/kubernetes/labels/wg%2Fdevice-management)|* Architecture<br>* Autoscaling<br>* Network<br>* Node<br>* Scheduling<br>|* [John Belamaric](https://github.com/johnbelamaric), Google<br>* [Kevin Klues](https://github.com/klueska), NVIDIA<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-device-management)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-device-management)|* Regular WG Meeting (Asia/Europe): [Wednesdays at 9:00 CET (Central European Time) (biweekly)](https://zoom.us/j/97238699195?pwd=cy9IMm1ZeERtRlJ3VS8yWUxHUWIrQT09)<br>* Regular WG Meeting (Europe/America): [Tuesdays at 8:30 PT (Pacific Time) (biweekly)](https://zoom.us/j/97238699195?pwd=cy9IMm1ZeERtRlJ3VS8yWUxHUWIrQT09)<br>
41 changes: 41 additions & 0 deletions sigs.yaml
@@ -3567,6 +3567,47 @@ workinggroups:
      liaison:
        github: pohly
        name: Patrick Ohly
  - dir: wg-ai-integration
    name: AI Integration
    mission_statement: >
      The AI Integration Working Group focuses on enabling seamless integration of AI/ML
      control planes with Kubernetes, as well as providing standardized patterns for
      deploying, managing, and operating AI applications at scale on Kubernetes.

    charter_link: charter.md
    stakeholder_sigs:
      - API Machinery
      - Apps
      - Architecture
      - Auth
      - CLI
    label: ai-integration
    leadership:
      chairs:
        - github: ardaguclu
          name: Arda Guclu
          company: Red Hat
          email: [email protected]
        - github: rushmash91
          name: Arush Sharma
          company: Amazon
          email: [email protected]
        - github: zvonkok
          name: Zvonko Kaiser
          company: NVIDIA
          email: [email protected]
    meetings:
      - description: WG AI Integration Weekly Meeting
        day: Wednesday
        time: "9:30"
        tz: PT (Pacific Time)
        frequency: weekly
    contact:
      slack: wg-ai-integration
      mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-ai-integration
Contributor

You're missing:

liaison:
  github: 
  name: 

section filled in, and the actual liaison appointed. I'll sync with the rest of the steering and will return with a name to put here.

Contributor Author

Added with TBD, thanks!

Member

@mrunalp You can add me as the liaison of this workgroup.

Contributor Author

@pacoxu I have added you as liaison. Thanks :) cc: @soltysh
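
The check raised in this thread (a working group entry needs a filled-in liaison) can be sketched in a few lines. This is only an illustration, not part of the community generator: the helper name `missing_liaison_fields` is hypothetical, the entry is written as a plain Python dict instead of being parsed from sigs.yaml, the nesting of `liaison` under `contact` is assumed, and treating `TBD` as "not filled in" is an assumption drawn from the exchange above.

```python
# Hypothetical helper; not taken from the kubernetes/community tooling.
PLACEHOLDERS = {"", "TBD"}  # assumption: "TBD" counts as not yet appointed


def missing_liaison_fields(wg_entry):
    """Return the liaison keys that are absent, empty, or still a placeholder."""
    # Assumption: the liaison block is nested under "contact" in the entry.
    liaison = (wg_entry.get("contact") or {}).get("liaison") or {}
    return [
        key
        for key in ("github", "name")
        if str(liaison.get(key) or "").strip() in PLACEHOLDERS
    ]


# The entry as it stood after the "Added with TBD" reply.
entry_tbd = {
    "dir": "wg-ai-integration",
    "contact": {
        "slack": "wg-ai-integration",
        "liaison": {"github": "TBD", "name": "TBD"},
    },
}

# The entry after the liaison was appointed.
entry_final = {
    "dir": "wg-ai-integration",
    "contact": {
        "slack": "wg-ai-integration",
        "liaison": {"github": "pacoxu", "name": "Paco Xu 徐俊杰"},
    },
}

if __name__ == "__main__":
    print(missing_liaison_fields(entry_tbd))    # ['github', 'name']
    print(missing_liaison_fields(entry_final))  # []
```

A check like this would flag both the original omission and the interim `TBD` state, and pass once a named liaison is in place.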

      liaison:
        github: pacoxu
        name: Paco Xu 徐俊杰
  - dir: wg-batch
    name: Batch
    mission_statement: >
8 changes: 8 additions & 0 deletions wg-ai-integration/OWNERS
@@ -0,0 +1,8 @@
# See the OWNERS docs at https://go.k8s.io/owners

reviewers:
  - wg-ai-integration-leads
approvers:
  - wg-ai-integration-leads
labels:
  - wg/ai-integration
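
The OWNERS file above names the alias `wg-ai-integration-leads` rather than individual users; the alias members are the three entries added to OWNERS_ALIASES earlier in this PR. A minimal sketch of how such an alias might be expanded into concrete reviewers and approvers follows. The helper `expand_owners` is hypothetical and is not the tooling that actually consumes these files; both files are represented as plain dicts rather than parsed YAML.

```python
# Hypothetical illustration of alias expansion; not the real OWNERS tooling.

# Alias definition as added to OWNERS_ALIASES in this PR.
ALIASES = {
    "wg-ai-integration-leads": ["ardaguclu", "rushmash91", "zvonkok"],
}

# Contents of wg-ai-integration/OWNERS as added in this PR.
OWNERS = {
    "reviewers": ["wg-ai-integration-leads"],
    "approvers": ["wg-ai-integration-leads"],
    "labels": ["wg/ai-integration"],
}


def expand_owners(owners, aliases):
    """Replace alias names in reviewers/approvers with the alias members."""
    expanded = {}
    for role in ("reviewers", "approvers"):
        people = []
        for entry in owners.get(role, []):
            # An entry matching an alias expands to its members;
            # anything else is treated as a plain username.
            for person in aliases.get(entry, [entry]):
                if person not in people:
                    people.append(person)
        expanded[role] = people
    return expanded


if __name__ == "__main__":
    print(expand_owners(OWNERS, ALIASES))
```

Keeping the member list in one alias means a later change of WG leads only touches OWNERS_ALIASES, not every OWNERS file that refers to the alias.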
39 changes: 39 additions & 0 deletions wg-ai-integration/README.md
@@ -0,0 +1,39 @@
<!---
This is an autogenerated file!

Please do not edit this file directly, but instead make changes to the
sigs.yaml file in the project root.

To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
--->
# AI Integration Working Group

The AI Integration Working Group focuses on enabling seamless integration of AI/ML control planes with Kubernetes, as well as providing standardized patterns for deploying, managing, and operating AI applications at scale on Kubernetes.

The [charter](charter.md) defines the scope and governance of the AI Integration Working Group.

## Stakeholder SIGs
* [SIG API Machinery](/sig-api-machinery)
* [SIG Apps](/sig-apps)
* [SIG Architecture](/sig-architecture)
* [SIG Auth](/sig-auth)
* [SIG CLI](/sig-cli)

## Meetings
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-integration) for the group will typically add invites for the following meetings to your calendar.*
* WG AI Integration Weekly Meeting: [Wednesdays at 9:30 PT (Pacific Time)]() (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9%3A30&tz=PT%20%28Pacific%20Time%29).

## Organizers

* Arda Guclu (**[@ardaguclu](https://github.com/ardaguclu)**), Red Hat
* Arush Sharma (**[@rushmash91](https://github.com/rushmash91)**), Amazon
Member

@rushmash91 can you work on getting your membership to be able to meet the requirements?

Please note that all working group organizers and holders of other leadership roles must be community members.

https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md

Member

@rushmash91, Jul 18, 2025

Sure! Working on it, opened issue!

* Zvonko Kaiser (**[@zvonkok](https://github.com/zvonkok)**), NVIDIA

## Contact
- Slack: [#wg-ai-integration](https://kubernetes.slack.com/messages/wg-ai-integration)
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-ai-integration)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fai-integration)
- Steering Committee Liaison: Paco Xu 徐俊杰 (**[@pacoxu](https://github.com/pacoxu)**)
<!-- BEGIN CUSTOM CONTENT -->

<!-- END CUSTOM CONTENT -->
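
The README above lists the weekly meeting as Wednesdays at 9:30 PT and links to a timezone converter. The same conversion can be sketched with the Python standard library. This is an illustration under assumptions: the helper `meeting_in_zone` is hypothetical, and mapping "PT (Pacific Time)" to the IANA zone `America/Los_Angeles` is an assumption, not something stated in sigs.yaml.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; needs system or packaged tz data

# Assumption: "PT (Pacific Time)" corresponds to this IANA zone.
MEETING_TZ = ZoneInfo("America/Los_Angeles")


def meeting_in_zone(year, month, day, target_zone):
    """Return the 9:30 PT meeting start on the given date in target_zone."""
    start = datetime(year, month, day, 9, 30, tzinfo=MEETING_TZ)
    return start.astimezone(ZoneInfo(target_zone))


if __name__ == "__main__":
    # 2025-08-06 is a Wednesday during daylight saving time (UTC-7).
    print(meeting_in_zone(2025, 8, 6, "UTC").isoformat())
    # 2025-01-08 is a Wednesday during standard time (UTC-8).
    print(meeting_in_zone(2025, 1, 8, "UTC").isoformat())
```

Because the slot is anchored to Pacific Time, its UTC equivalent shifts by an hour between summer and winter, which is why the conversion takes a concrete date rather than only a time of day.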
105 changes: 105 additions & 0 deletions wg-ai-integration/charter.md
@@ -0,0 +1,105 @@
# WG AI Integration Charter

This charter adheres to the conventions described in the [Kubernetes Charter
README] and uses the Roles and Organization Management outlined in
[wg-governance].

## Scope

The AI Integration Working Group focuses on enabling seamless integration of
AI/ML control planes with Kubernetes, as well as providing standardized
patterns for deploying, managing, and operating AI applications at scale
on Kubernetes.
Comment on lines +9 to +12
Contributor

> The AI Integration Working Group focuses on enabling seamless integration of
> AI/ML control planes with Kubernetes

This makes sense to me: how can Kubernetes best integrate with external AI systems.

> providing standardized
> patterns for deploying, managing, and operating AI applications at scale
> on Kubernetes.

This looks very close to wg-serving. Trying to confirm: wg-ai-integration is about how external AI systems can use kube, and wg-serving is about how best to run AI workloads on kube, right?

Contributor Author

Serving is for running AI models on k8s. The AI workloads (such as agents) connect to LLM/AI APIs, such as those run by projects in WG Serving scope or hosted externally.

Contributor

wg-serving is specifically focusing on hardware-accelerated AI/ML inference, so that's the fine line between these two.


The Working Group will provide a forum for a broad engineering community to
give feedback to the project on challenges encountered when integrating with
Kubernetes.

This addresses a broad need with many end-users deploying complex AI systems,
AI/ML platform providers, Kubernetes distributions, and developers of
distributed AI applications facing these integration challenges. Standardizing
solutions in this space benefits the entire Kubernetes ecosystem. Adjacent
ecosystems could link to the outputs of this WG as a trusted vehicle for
supporting AI integrations with Kubernetes.

### In scope

* Develop a shared community point of view and associated best practices
enabling AI agent (or multi-agent) systems to integrate with Kubernetes.

* Provide a forum for intersecting code experimentation in AI integration
space and discussion with the existing Kubernetes community.

* Recommend an appropriate go-forward governance model for AI integrations
with the Kubernetes project.

* Identify appropriate auth(z) patterns for AI connector identities, their
closest callers, and Kubernetes RBAC.

* Define benchmarks on the pros and cons of design approaches to meet user outcomes.

* Ensure security, observability, and policy enforcement can be consistently
applied across integrated systems (K8s and external Control Planes such as
LLMs) and AI integration applications.

* Define potential enhancements to API conventions to scale AI integration
patterns that respect data privacy and safety concerns during our design
process. Consider alternative API patterns that could be a better fit for
AI enablement.

* Explore patterns for efficient network access to emergent protocols such
Contributor Author

@soltysh Here is the point related to networking :)

Contributor

Thank you, I've seen that, but still missed it 😅

as MCP/A2A via proxies or gateways.

* Reduce the complexity and custom development required for deploying,
building, and managing connectors between the Kubernetes API and AI agent ecosystems.

### Out of Scope

* Development of AI/ML frameworks or applications
* General-purpose workload management not specific to AI/ML
* Deploying inference workloads on Kubernetes (which is covered by WG Serving)
* Managing accelerator devices (which is covered by WG Device Management)

## Deliverables

* The WG will provide space for collaboration and experimentation. If/when any
solid ideas emerge that require changes to Kubernetes (for example, updates
to kubectl for AI consumption), the WG will facilitate and coordinate the delivery
of KEPs and their implementations by the participating SIGs.
* Interim artifacts will include documents capturing use cases, requirements,
integration architecture designs, and AI application communication patterns.
* Establish a best-practices document for AI tool integration with Kubernetes and
a clear recommendation on whether, and which, reference tools best fit in the
Kubernetes project itself, informed by data-driven experimentation and with an
appropriate governance model.

## Stakeholders

* SIG Architecture
* SIG API Machinery
* SIG Apps
* SIG Auth
* SIG CLI
* SIG Network
Contributor

The big areas where SIG Network has been hearing about / dealing with AI are DRA (aka "Manage accelerator devices", which is marked out-of-scope) and inference-related Gateway features (which would seem to fall under "Deploying inference workloads", which is also marked out-of-scope, and which is the subject of a different AI WG proposal anyway). None of the things you list as "In scope" above seem like they need input from SIG Network.

@kubernetes/sig-network-leads ?

Contributor Author

One area where SIG Network could have input is for protocol (such as MCP) proxies or gateways.
I can add that to the list of areas to explore.

Member

I defer to Dan. I'm really not very familiar with the protocols you mention, but it seems to me that both WG Serving and the AI-GW proposal Dan is pointing to will overlap in that area.

Contributor Author

Thanks for the review. I added an item for this in the charter.

Contributor

So do we still need sig-network here or not? And where is the updated point, I can't find it 😅

Contributor

@MikeZappa87, Aug 1, 2025

@aojea what are your specific concerns here?

What are we currently doing with these WGs?

Are we actively involved? Should we reduce this list? I know dev-mgmt; however, are we as sig-net still playing a daily role in this?

Member

@aojea, Aug 1, 2025

Let me try to answer. Those WGs touch areas from the SIG Network charter: node lifecycle with endpoints, device management with network endpoints, serving as inference ... We asked a fair question about what the overlap with the SIG Network charter is, and the answer is the A2A and MCP protocols, which, the same as websocket, are not in the SIG Network charter ... I don't have more interest than resolving a conversation, and I talked with mrunal in private to clarify ... If any SIG Network lead had approved without an unresolved conversation I would not have any objections; I trust others to make the call they think is good for the SIG ... But I found it surprising to approve while the conversation was open and waiting for an answer.

Contributor

@MikeZappa87, Aug 1, 2025

I believe @shaneutt had a distinction between the current AI-related working groups and the proposed one. The question is fair; I am not contesting that, more asking for information on how to prioritize appropriately. We should have clearly defined boundaries; however, slight overlap isn't a bad thing at all. Shane is currently OOF, so we should give him the benefit of the doubt that this comment was missed. @shaneutt, could you address the concern from @danwinship? In the interest of transparency, let's keep all conversations related to this PR here.

Member

Let me explain myself better. I'm totally in favor of this WG, and I was on my way to +1 with my steering hat on, thinking this conversation was resolved. This WG is going to be created with or without SIG Network; there are already 5 SIGs sponsoring it, so let's not make a big drama of this.

Now, with my SIG Network hat on: what is the role of SIG Network here? As Dan correctly pointed out, we already have another WG in this area and a proposal for a new WG #8519 (comment)

So, if you say SIG Network has a role here because of foo and it does not intersect with any of the other WGs, then it is OK ... BUT at some point WGs want their things to get done and then go to the SIGs ... and I really want to avoid SIG Network having to deal with conflicts of interest between WGs, because that burns people out and breaks communities: people who get frustrated because they were working in the WG with one goal, and people who have to say no or have to choose between competing implementations ... That would mean we didn't do our work as leads in reviewing the WG proposal. That is what we should do here: be objective, talk, discuss, agree, and review thoroughly for the good of the project ...

Contributor Author

I just dropped SIG Network from the list. I would leave it up to sig-network leads if/how they engage with the WG. Thanks!


## Roles and Organization Management

This working group adheres to the Roles and Organization Management outlined in
[wg-governance] and opts-in to updates and modifications to [wg-governance].

## Exit Criteria

The WG is done if/when a shared recommendation is in place for how the Kubernetes
project should or should not integrate with these emergent systems. This could
include a recommendation for Kubernetes to adopt and/or evolve tools (e.g. MCP
connectors, benchmark or environment validation tooling, etc.) and evolve its
own governance model to provide proper stewardship within the project or outside.

The working group will disband when the KEPs resulting from these discussions
have reached a terminal state. When the core functionality for AI workload
management reaches GA, we will evaluate whether the working group should
be disbanded and any remaining KEPs be left to the management of their owning
SIGs.

[wg-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md
[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md