
Conversation

@fmuyassarov (Member) commented Jul 7, 2025:

This patch exports a label indicating the kubelet's configured swap behavior. By default, it reads from /var/lib/kubelet/config.yaml; a custom kubelet config path can be set via the -kubelet-config-path flag.
If the swap behavior is not specified, NoSwap is assumed (matching the kubelet's default).

Note: feature.node.kubernetes.io/memory-swap.behavior (more generally, kubelet's memorySwap.swapBehavior) reflects whether Kubernetes workloads are allowed to use swap. A node may have swap enabled, but if the kubelet is set to NoSwap, pods cannot use it. As such, the behavior label is exported only if node-level swap is enabled:

      feature.node.kubernetes.io/memory-swap: "true"
      feature.node.kubernetes.io/memory-swap.behavior: LimitedSwap

Fixes: #2178
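
For illustration, a minimal sketch of the file-based detection described above, modeling only the memorySwap.swapBehavior field of the kubelet's KubeletConfiguration (this is a sketch, not the PR's actual implementation):

package main

import (
	"fmt"
	"os"

	"sigs.k8s.io/yaml"
)

// kubeletConfig models only the field we need from the kubelet's
// KubeletConfiguration: memorySwap.swapBehavior.
type kubeletConfig struct {
	MemorySwap struct {
		SwapBehavior string `json:"swapBehavior"`
	} `json:"memorySwap"`
}

// swapBehaviorFromFile reads the kubelet config file and returns the
// configured swap behavior, defaulting to "NoSwap" when unset.
func swapBehaviorFromFile(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	var cfg kubeletConfig
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return "", err
	}
	if cfg.MemorySwap.SwapBehavior == "" {
		return "NoSwap", nil // kubelet's default
	}
	return cfg.MemorySwap.SwapBehavior, nil
}

func main() {
	behavior, err := swapBehaviorFromFile("/var/lib/kubelet/config.yaml")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("feature.node.kubernetes.io/memory-swap.behavior:", behavior)
}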

netlify bot commented Jul 7, 2025:

Deploy Preview for kubernetes-sigs-nfd ready!
🔨 Latest commit f5adaa8
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-nfd/deploys/68cc00c97fb88a00086f5eeb
😎 Deploy Preview https://deploy-preview-2192--kubernetes-sigs-nfd.netlify.app

k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA) on Jul 7, 2025
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fmuyassarov
Once this PR has been reviewed and has the lgtm label, please assign marquiz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files) on Jul 7, 2025
@fmuyassarov (Member, Author) commented:

/test pull-node-feature-discovery-build-image-cross-generic

@marquiz (Contributor) left a comment:

@fmuyassarov thank you for the patch and working on this. Let's get this finalized.

My main comment is that in topology-updater we already read the kubelet config (read from the kubelet configz endpoint by default); the helpers are in pkg/utils/kubeconf/. Could we do the same in nfd-worker?
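
For context, a hedged sketch of reading a node's configz endpoint through the API server with client-go; the actual helpers in pkg/utils/kubeconf/ may differ, but the "kubeletconfig" wrapper key is how the kubelet serves /configz:

// Package placement mirrors pkg/utils/kubeconf; the helper name is illustrative.
package kubeconf

import (
	"context"
	"encoding/json"

	"k8s.io/client-go/kubernetes"
)

// getKubeletConfigz fetches the kubelet's running configuration via the
// API server node proxy: GET /api/v1/nodes/<node>/proxy/configz.
func getKubeletConfigz(ctx context.Context, cs kubernetes.Interface, nodeName string) (map[string]interface{}, error) {
	raw, err := cs.CoreV1().RESTClient().Get().
		Resource("nodes").
		Name(nodeName).
		SubResource("proxy").
		Suffix("configz").
		DoRaw(ctx)
	if err != nil {
		return nil, err
	}
	// The kubelet wraps its config in a top-level "kubeletconfig" object.
	var resp struct {
		KubeletConfig map[string]interface{} `json:"kubeletconfig"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	return resp.KubeletConfig, nil
}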

@fmuyassarov (Member, Author) replied:

> @fmuyassarov thank you for the patch and working on this. Let's get this finalized.
>
> My main comment is that in topology-updater we already read the kubelet config (read from the kubelet configz endpoint by default); the helpers are in pkg/utils/kubeconf/. Could we do the same in nfd-worker?

I will check that.

k8s-ci-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD) on Sep 8, 2025
k8s-ci-robot removed the needs-rebase label on Sep 8, 2025
@fmuyassarov (Member, Author) commented:

Hi @marquiz. I've reworked the patch to utilize configz. The current state is missing the documentation update, which I plan to add once I get a green light that this is the right approach. Please take a look.

@fmuyassarov force-pushed the devel/swapehavior branch 3 times, most recently from 30fc56f to a75f334 on September 15, 2025 13:05
@marquiz (Contributor) left a comment:

Thanks for the update @fmuyassarov. A few quick comments below.

The bigger question to me is how we should place the new feature in the hierarchy. The feature is related to swap, but it is specifically a kubernetes/kubelet config setting. I could imagine e.g. a new system.kubelet or system.kubernetes feature. Then the kubeletconfig options would not be command-line flags for nfd-worker but config options for the source. We would also gain robustness: nfd-worker would not refuse to start if it could not get the kubelet config.

All in all, I'm not questioning the need for the feature. I just need to think about what would be the most logical way to expose and implement it. I need to sleep on it.
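
A sketch of the robustness point raised above, assuming a hypothetical getKubeletConfig helper: a fetch failure skips the feature instead of aborting the worker:

// Package placement is illustrative.
package swapfeature

import (
	"context"

	"k8s.io/klog/v2"
	kubeletconfigv1beta1 "k8s.io/kubelet/config/v1beta1"
)

// discoverSwapBehavior degrades gracefully: on error the swap-behavior
// feature is simply not discovered and nfd-worker keeps running.
// getKubeletConfig is a hypothetical helper, not an existing NFD API.
func discoverSwapBehavior(ctx context.Context,
	getKubeletConfig func(context.Context) (*kubeletconfigv1beta1.KubeletConfiguration, error)) string {
	cfg, err := getKubeletConfig(ctx)
	if err != nil {
		klog.ErrorS(err, "cannot read kubelet config, skipping swap-behavior feature")
		return ""
	}
	if cfg.MemorySwap.SwapBehavior == "" {
		return "NoSwap" // kubelet's default
	}
	return cfg.MemorySwap.SwapBehavior
}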

 	// Add pod owner reference if it exists
 	if podName != "" {
-		if selfPod, err := w.k8sClient.CoreV1().Pods(w.kubernetesNamespace).Get(context.TODO(), podName, metav1.GetOptions{}); err != nil {
+		if selfPod, err := w.k8sClient.CoreV1().Pods(podNamespace).Get(context.TODO(), podName, metav1.GetOptions{}); err != nil {
A reviewer (Contributor) commented:

What's wrong with w.kubernetesNamespace?

@fmuyassarov (Member, Author) replied:

If I recall correctly from my tests, if kubernetesNamespace is empty the request is refused because we are not using the ClusterRole. But if I follow your earlier comment, keeping the Role as is and creating the ClusterRole only for nodes/proxy, then we can keep w.kubernetesNamespace.
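
For illustration, the scoping discussed here expressed as a client-go object; a sketch only, with an illustrative ClusterRole name (the real manifests live under deployment/base/rbac/):

// Package placement is illustrative.
package rbac

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// A ClusterRole limited to the node-proxy access needed for /configz;
// all namespaced permissions stay in the existing Role. The name is
// illustrative, not the PR's actual manifest.
var workerNodesProxyClusterRole = &rbacv1.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: "nfd-worker-nodes-proxy"},
	Rules: []rbacv1.PolicyRule{{
		APIGroups: []string{""},
		Resources: []string{"nodes/proxy"},
		Verbs:     []string{"get"},
	}},
}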

A reviewer (Contributor) commented:

So we can expect this to be reverted?

@fmuyassarov (Member, Author) replied:

> So we can expect this to be reverted?

yes

@fmuyassarov (Member, Author) commented:

@marquiz It looks like we still need this. Following your earlier comment about using a ClusterRole only for nodes/proxy, I updated the code to create a ClusterRole and ClusterRoleBinding in addition to the existing Role and RoleBinding. However, during my test runs, since I'm not providing the KUBERNETES_NAMESPACE variable, w.kubernetesNamespace defaults to an empty value, which causes the following issue:

E0918 10:19:43.598034       1 nfd-worker.go:282] "failed to get self pod, cannot inherit ownerReference for NodeFeature" err="the server could not find the requested resource"
Tracking new pod rollout (nfd-worker-5m584):
     ┊ Scheduled       - <1s
     ┊ Initialized     - <1s
     ┃ Not Ready       - (ContainersNotReady): containers with unready status: [nfd-worker]
[event: pod node-feature-discovery/nfd-worker-5m584] Container image "ttl.sh/gcr.io_k8s-staging-nfd_node-feature-discovery:tilt-bc71b0b17aea3fb5" already present on machine
I0918 10:19:41.895350       1 nfd-worker.go:315] "Node Feature Discovery Worker" version="" nodeName="eseldb12u01" namespace=""
E0918 10:19:41.904694       1 nfd-worker.go:282] "failed to get self pod, cannot inherit ownerReference for NodeFeature" err="the server could not find the requested resource"
E0918 10:19:41.904738       1 main.go:73] "error while running" err="the server could not find the requested resource"
[event: pod node-feature-discovery/nfd-worker-5m584] Container image "ttl.sh/gcr.io_k8s-staging-nfd_node-feature-discovery:tilt-bc71b0b17aea3fb5" already present on machine
[event: pod node-feature-discovery/nfd-worker-5m584] Back-off restarting failed container nfd-worker in pod nfd-worker-5m584_node-feature-discovery(784862c3-3fbe-445a-8cc2-e7bee0675246)
Detected container restart. Pod: nfd-worker-5m584. Container: nfd-worker.
I0918 10:19:43.506399       1 nfd-worker.go:315] "Node Feature Discovery Worker" version="" nodeName="eseldb12u01" namespace=""
E0918 10:19:43.598034       1 nfd-worker.go:282] "failed to get self pod, cannot inherit ownerReference for NodeFeature" err="the server could not find the requested resource"
E0918 10:19:43.598101       1 main.go:73] "error while running" err="the server could not find the requested resource"
[event: pod node-feature-discovery/nfd-worker-5m584] Back-off restarting failed container nfd-worker in pod nfd-worker-5m584_node-feature-discovery(784862c3-3fbe-445a-8cc2-e7bee0675246)
[event: pod node-feature-discovery/nfd-worker-5m584] Back-off restarting failed container nfd-worker in pod nfd-worker-5m584_node-feature-discovery(784862c3-3fbe-445a-8cc2-e7bee0675246)
[event: pod node-feature-discovery/nfd-worker-5m584] Back-off restarting failed container nfd-worker in pod nfd-worker-5m584_node-feature-discovery(784862c3-3fbe-445a-8cc2-e7bee0675246)
[event: pod node-feature-discovery/nfd-worker-5m584] Container image "ttl.sh/gcr.io_k8s-staging-nfd_node-feature-discovery:tilt-bc71b0b17aea3fb5" already present on machine
[event: pod node-feature-discovery/nfd-worker-5m584] Back-off restarting failed container nfd-worker in pod nfd-worker-5m584_node-feature-discovery(784862c3-3fbe-445a-8cc2-e7bee0675246)
Detected container restart. Pod: nfd-worker-5m584. Container: nfd-worker.

But if we rely on the downward API to fetch the pod's own namespace, we can avoid leaving w.kubernetesNamespace empty. Perhaps there are issues with this approach that I'm unaware of, but I'm happy to discuss it.

@marquiz (Contributor) replied:

@fmuyassarov w.kubernetesNamespace is initialized with utils.GetKubernetesNamespace(). That, in turn, takes the value from /var/run/secrets/kubernetes.io/serviceaccount/namespace or, if not found, falls back to the KUBERNETES_NAMESPACE env variable. That should work. In normal deployments there should be no need to set the env variable, but in tilt you might need to.
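
For reference, a minimal sketch of that lookup order as described above (not necessarily the exact utils.GetKubernetesNamespace implementation):

// Package placement is illustrative.
package utils

import (
	"os"
	"strings"
)

// getKubernetesNamespace resolves the pod's namespace from the mounted
// service account, falling back to the KUBERNETES_NAMESPACE env variable.
func getKubernetesNamespace() string {
	const nsFile = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"
	if data, err := os.ReadFile(nsFile); err == nil {
		return strings.TrimSpace(string(data))
	}
	return os.Getenv("KUBERNETES_NAMESPACE")
}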

@ArangoGutierrez (Contributor) commented:

/retest

@Copilot (Copilot AI) left a comment:

Pull Request Overview

This PR exports a label indicating the kubelet's configured swap behavior for memory swap feature discovery. The change allows NFD to report whether Kubernetes workloads are allowed to use swap memory based on kubelet configuration.

  • Adds functionality to read kubelet configuration and extract swap behavior settings
  • Implements support for both file-based and API-based kubelet config access
  • Updates RBAC permissions to support accessing kubelet configuration endpoints

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Summary per file:

  • source/memory/memory.go: Adds swap behavior detection and labeling functionality
  • pkg/nfd-worker/nfd-worker.go: Implements kubelet config reading and swap behavior initialization
  • deployment/components/common/env.yaml: Adds required environment variables for node address and pod namespace
  • deployment/base/rbac/worker-rolebinding.yaml: Changes from Role to ClusterRole binding for broader permissions
  • deployment/base/rbac/worker-role.yaml: Updates permissions to include nodes/proxy access for kubelet API
  • cmd/nfd-worker/main.go: Adds command-line flags and default kubelet config URI construction


Comment on lines +192 to 198
	nfd = &nfdWorker{
		kubeletConfigFunc: kubeletConfigFunc,
	}

	for _, o := range opts {
		o.apply(nfd)
	}

Copilot AI commented Sep 16, 2025:

The nfd variable is being reassigned to a new nfdWorker struct, which discards all the previous initialization including the stop channel and other fields set earlier in the function. This will cause the worker to lose its previous configuration.

Suggested change:

-	nfd = &nfdWorker{
-		kubeletConfigFunc: kubeletConfigFunc,
-	}
-	for _, o := range opts {
-		o.apply(nfd)
-	}
+	nfd.kubeletConfigFunc = kubeletConfigFunc


@ArangoGutierrez (Contributor) left a comment:

2 comments

 	// Add pod owner reference if it exists
 	if podName != "" {
-		if selfPod, err := w.k8sClient.CoreV1().Pods(w.kubernetesNamespace).Get(context.TODO(), podName, metav1.GetOptions{}); err != nil {
+		if selfPod, err := w.k8sClient.CoreV1().Pods(podNamespace).Get(context.TODO(), podName, metav1.GetOptions{}); err != nil {
@ArangoGutierrez (Contributor) commented:

So we can expect this to be reverted?

@marquiz (Contributor) commented Sep 16, 2025:

@fmuyassarov independent of where we logically put the new feature, I think the detection should be outside the nfd-worker core package. If we keep it in memory.swap, then the feature name should probably somehow reflect that it's kubelet behavior.

@fmuyassarov (Member, Author) replied:

> @fmuyassarov independent of where we logically put the new feature, I think the detection should be outside the nfd-worker core package. If we keep it in memory.swap, then the feature name should probably somehow reflect that it's kubelet behavior.

The reason for splitting the detection between the memory and worker packages is that the memory package doesn't have access to the kubeconfig and the kubelet. Technically speaking, there is no need to do anything within the memory package. But I used the memory package because we are detecting a memory-related feature. Sure, it comes via the kubelet, but the main thing is not the kubelet, it is the memory. Since we already had logic for detecting swap within the memory package, I thought of extending it to cover the swap behavior too.
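
A sketch of the split being described, with hypothetical names (SetSwapBehavior, swapLabels): the worker owns the kubelet access and injects only the resulting value; the memory source turns it into the labels from the PR description:

// Mirrors source/memory; the names SetSwapBehavior and swapLabels are hypothetical.
package memory

var swapBehavior string

// SetSwapBehavior would be called by nfd-worker after it has read the
// kubelet config; the memory source itself never touches the kubelet.
func SetSwapBehavior(behavior string) { swapBehavior = behavior }

// swapLabels gates the behavior label on node-level swap being enabled,
// matching the labels shown in the PR description.
func swapLabels(nodeSwapEnabled bool) map[string]string {
	labels := map[string]string{}
	if !nodeSwapEnabled {
		return labels
	}
	labels["feature.node.kubernetes.io/memory-swap"] = "true"
	if swapBehavior != "" {
		labels["feature.node.kubernetes.io/memory-swap.behavior"] = swapBehavior
	}
	return labels
}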

Signed-off-by: Feruzjon Muyassarov <[email protected]>
@k8s-ci-robot (Contributor) commented:

@fmuyassarov: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                    Commit   Required  Rerun command
pull-node-feature-discovery-verify-master    f5adaa8  true      /test pull-node-feature-discovery-verify-master
pull-node-feature-discovery-e2e-test-master  f5adaa8  true      /test pull-node-feature-discovery-e2e-test-master

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

add swap detailed feature label
4 participants