Commit 319142b

ci-penbot-01, nikhilsk
authored and committed
[GPUOP] Adding docs for DCM Systemd integration (#851) (#852)
(cherry picked from commit 5389d38c6a9efbb7680ac74c28a874aef221728a)
Co-authored-by: nikhilsk <[email protected]>
Signed-off-by: yansun1996 <[email protected]>
1 parent bde22a6 commit 319142b

6 files changed: 113 additions, 0 deletions

docs/dcm/applying-partition-profiles.rst

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -102,6 +102,9 @@ Below is an example of how to create the `config-manager-config.yaml` file that
102102
}
103103
]
104104
}
105+
},
106+
"gpuClientSystemdServices": {
107+
"names": ["amd-metrics-exporter", "gpuagent"]
105108
}
106109
}
107110
EOF

docs/dcm/device-config-manager-configmap.md

Lines changed: 7 additions & 0 deletions
@@ -43,6 +43,9 @@ data:
4343
}
4444
]
4545
}
46+
},
47+
"gpuClientSystemdServices": {
48+
"names": ["amd-metrics-exporter", "gpuagent"]
4649
}
4750
}
4851
```
@@ -57,6 +60,7 @@ Below is an explanation of each field in the ConfigMap:
5760
| `computePartition` | Compute partition type |
5861
| `memoryPartition` | Memory partition type |
5962
| `numGPUsAssigned` | Number of GPUs to be partitioned on the node |
63+
| `gpuClientSystemdServices` | Defines a list of systemd service unit files to be stopped/restarted on the node |
6064

6165
```{note}
6266
Users can create a heterogeneous partitioning config profile by specifying more than one `computePartition` scheme in the `profiles` array; however, this configuration is neither recommended nor supported by AMD. Note that NPS4 memory partition mode does not work with heterogeneous partition schemes and only supports CPX on MI300X systems.
@@ -90,6 +94,9 @@ Users can create a heterogeneous partitioning config profile by specifying more
9094
}
9195
]
9296
}
97+
},
98+
"gpuClientSystemdServices": {
99+
"names": ["amd-metrics-exporter", "gpuagent"]
93100
}
94101
}
95102

docs/dcm/systemd_integration.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
1+
# DCM Systemd Integration
2+
3+
## Background
4+
5+
The Device Config Manager (DCM) orchestrates hardware-level tasks such as GPU partitioning. Before it starts partitioning, DCM gracefully stops the systemd services listed in a ConfigMap so that their processes (for example, gpuagent) cannot interfere with partitioning and the devices stay in a consistent state.
6+
7+
## K8S ConfigMap enhancement
8+
9+
The ConfigMap contains a `gpuClientSystemdServices` key that declares the list of services to manage:
10+
11+
```yaml
12+
"gpuClientSystemdServices": {
13+
"names": ["amd-metrics-exporter", "gpuagent"]
14+
}
15+
```
16+
- These are the unit names (without the `.service` suffix) of systemd services related to GPU runtime agents. DCM appends the suffix in code.
17+
- Users can add services to this list or modify its existing entries.
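
As a rough illustration of how the `names` array maps to full unit names, here is a minimal Python sketch. The helper `unit_names_from_config` is hypothetical and is not DCM's actual code; it only assumes the JSON layout shown above and appends `.service` to every entry, as described.

```python
import json


def unit_names_from_config(config_text: str) -> list[str]:
    """Return full systemd unit names for the services listed in the config.

    Hypothetical helper for illustration only; DCM's real implementation
    is not shown in this document.
    """
    config = json.loads(config_text)
    # A missing key is treated as "no services to manage" (an assumption of this sketch).
    names = config.get("gpuClientSystemdServices", {}).get("names", [])
    # Entries are given without the suffix, so ".service" is always appended.
    return [f"{name}.service" for name in names]


example = '{"gpuClientSystemdServices": {"names": ["amd-metrics-exporter", "gpuagent"]}}'
print(unit_names_from_config(example))
# prints ['amd-metrics-exporter.service', 'gpuagent.service']
```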
18+
19+
## ConfigMap
20+
21+
```yaml
22+
apiVersion: v1
23+
kind: ConfigMap
24+
metadata:
25+
  name: config-manager-config
26+
  namespace: kube-amd-gpu
27+
data:
28+
  config.json: |
29+
    {
30+
      "gpu-config-profiles":
31+
      {
32+
        "cpx-profile":
33+
        {
34+
          "skippedGPUs": {
35+
            "ids": []
36+
          },
37+
          "profiles": [
38+
            {
39+
              "computePartition": "CPX",
40+
              "memoryPartition": "NPS4",
41+
              "numGPUsAssigned": 8
42+
            }
43+
          ]
44+
        },
45+
        "spx-profile":
46+
        {
47+
          "skippedGPUs": {
48+
            "ids": []
49+
          },
50+
          "profiles": [
51+
            {
52+
              "computePartition": "SPX",
53+
              "memoryPartition": "NPS1",
54+
              "numGPUsAssigned": 8
55+
            }
56+
          ]
57+
        }
58+
      },
59+
      "gpuClientSystemdServices": {
60+
        "names": ["amd-metrics-exporter", "gpuagent"]
61+
      }
62+
    }
63+
```
64+
65+
## Required Mounts for D-Bus & systemd Integration
66+
67+
| **Mount Name** | **Mount Path** | **Purpose** |
68+
|------------------------|------------------------|---------------------------------------------------------------------------|
69+
| `etc-systemd` | `/etc/systemd` | Access unit files for service definitions |
70+
| `run-systemd` | `/run/systemd` | Enables access to systemd runtime state |
71+
| `usr-lib-systemd` | `/usr/lib/systemd` | Required for systemd libraries and binaries |
72+
| `var-run-dbus` | `/var/run/dbus` | Allows DCM to communicate via system D-Bus (`system_bus_socket`) |
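
To confirm that these paths are visible from inside the container, a check along the following lines may help. This is a sketch only: the path list is copied from the table above, and `missing_paths` and its `root` parameter are hypothetical names introduced here for illustration (`root` exists only so the check can be pointed at a test directory).

```python
import os

# Paths taken from the mount table; the socket file name comes from the `var-run-dbus` row.
REQUIRED_PATHS = [
    "/etc/systemd",
    "/run/systemd",
    "/usr/lib/systemd",
    "/var/run/dbus/system_bus_socket",
]


def missing_paths(paths, root="/"):
    """Return the entries of `paths` that do not exist under `root`."""
    return [p for p in paths if not os.path.exists(os.path.join(root, p.lstrip("/")))]


if __name__ == "__main__":
    missing = missing_paths(REQUIRED_PATHS)
    print("missing:", missing if missing else "none")
```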
73+
74+
## Workflow
75+
76+
- DCM uses D-Bus APIs to query, stop, and restart systemd services programmatically, ensuring precise service orchestration.
77+
78+
- Extract Service List: On startup, DCM parses the ConfigMap and retrieves the `names` array under `gpuClientSystemdServices`. It appends `.service` to each entry to form the full unit name.
79+
80+
- Capture Pre-State:
81+
  - For each service:
82+
    - DCM checks its status over D-Bus via `org.freedesktop.systemd1.Manager.GetUnit`.
83+
    - DCM stores the current state (e.g. `active`, `inactive`, `not-loaded`) in PreStateDB.
84+
  - PreStateDB is later used to restore service state after partitioning.
85+
86+
- Stop Services: Services are stopped gracefully using D-Bus APIs. This ensures they release GPU resources and don't disrupt the partitioning operation. Before stopping a service, DCM uses the CheckUnitStatus API to confirm that the service is present.
87+
88+
- Perform Partitioning: Once the services are stopped, DCM runs the partitioning logic (driven by node labels and ConfigMap profiles) and completes the partitioning workflow.
89+
90+
- Restart & Restore State: After partitioning:
91+
  - DCM checks PreStateDB to determine which services were previously active.
92+
  - Only those services are restarted, using the D-Bus APIs.
93+
  - PreStateDB is then cleared via the `CleanupPreState()` function, which resets the tracker DB for the next run.
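
The workflow above can be sketched in Python as follows. This is not DCM's implementation: `FakeSystemd` is a hypothetical in-memory stand-in for the D-Bus calls so the example runs anywhere, and `partition_with_services_stopped` and `do_partition` are illustrative names. The sketch only shows the ordering of the steps: capture pre-state, stop, partition, restart what was active, clear the tracker.

```python
class FakeSystemd:
    """In-memory stand-in for systemd; a real implementation would use D-Bus."""

    def __init__(self, states):
        self.states = dict(states)  # unit name -> "active" or "inactive"

    def state(self, unit):
        # Units unknown to the stand-in are reported as not loaded.
        return self.states.get(unit, "not-loaded")

    def stop(self, unit):
        self.states[unit] = "inactive"

    def start(self, unit):
        self.states[unit] = "active"


def partition_with_services_stopped(systemd, units, do_partition):
    pre_state = {}  # plays the role of PreStateDB

    # Capture the state of every unit before touching anything.
    for unit in units:
        pre_state[unit] = systemd.state(unit)

    # Stop only units that are actually present.
    for unit in units:
        if pre_state[unit] != "not-loaded":
            systemd.stop(unit)

    # Run the partitioning step while the services are down.
    do_partition()

    # Restart only the units that were active beforehand.
    for unit, state in pre_state.items():
        if state == "active":
            systemd.start(unit)

    # Reset the tracker for the next run (the role CleanupPreState() plays).
    pre_state.clear()


units = ["amd-metrics-exporter.service", "gpuagent.service"]
systemd = FakeSystemd({"amd-metrics-exporter.service": "active",
                       "gpuagent.service": "inactive"})
seen = []  # snapshot of unit states taken while "partitioning" runs
partition_with_services_stopped(systemd, units,
                                lambda: seen.append(dict(systemd.states)))
print(seen[0])         # both units are inactive during partitioning
print(systemd.states)  # only the previously active unit is running again
```

In this run, `gpuagent.service` was inactive before partitioning, so it stays inactive afterward; only `amd-metrics-exporter.service` is restarted.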
94+
95+
## Conclusion
96+
97+
- Avoids GPU contention during partitioning, so device-busy errors do not occur
98+
- Maintains service continuity with minimal downtime

docs/sphinx/_toc.yml

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@ subtrees:
5858
- file: dcm/device-config-manager
5959
- file: dcm/device-config-manager-configmap
6060
- file: dcm/applying-partition-profiles
61+
- file: dcm/systemd_integration
6162
- caption: Specialized Networks
6263
entries:
6364
- file: specialized_networks/airgapped-install

docs/sphinx/_toc.yml.in

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@ subtrees:
5858
- file: dcm/device-config-manager
5959
- file: dcm/device-config-manager-configmap
6060
- file: dcm/applying-partition-profiles
61+
- file: dcm/systemd_integration
6162
- caption: Specialized Networks
6263
entries:
6364
- file: specialized_networks/airgapped-install

example/configManager/config.json

Lines changed: 3 additions & 0 deletions
@@ -27,5 +27,8 @@
2727
}
2828
]
2929
}
30+
},
31+
"gpuClientSystemdServices": {
32+
"names": ["amd-metrics-exporter", "gpuagent"]
3033
}
3134
}
