Refactor: Externalize Scheduler's saturation logic and criticality-based service differentiation #805

Open
wants to merge 1 commit into
base: main

LukeAVanDrie
Contributor

@LukeAVanDrie LukeAVanDrie commented May 8, 2025

This commit refactors the request processing pipeline, externalizing saturation detection and criticality-based service differentiation from the Scheduler. These responsibilities are now primarily managed by the RequestControl.Director.

This change is a preparatory step for the introduction of a new Flow Controller component, which will eventually absorb these admission control duties.

Diff base is: #808 (split out for easier reviewing)
Related to: #674

Key changes include:

  • Added a PreDispatch method to RequestControl.Director. It uses the SaturationDetector for admission control of non-critical requests and uses request criticality to decide whether saturation checks are bypassed.
  • The saturation detection logic for dropping non-critical requests is intentionally preserved within the Director at this stage. This allows the option to bypass the future Flow Controller component during its maturation, ensuring the existing saturation and sheddable request behavior can be maintained as a fallback.
  • Simplified the Scheduler to focus solely on preference-based filtering and pod selection for requests that have already been admitted by the Director.
  • Removed the SheddableRequestFilter and the distinct critical/sheddable filter paths from the Scheduler's internal logic. The Scheduler now applies a single, unified preference filter chain to all incoming requests.
  • Updated main.go to instantiate the SaturationDetector, wiring it into the request handling flow.
  • Updated tests across scheduler_test.go, director_test.go, and filter_test.go to align with the new component responsibilities, adding coverage where necessary.
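
The admission flow described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `Criticality` values, the `SaturationDetector` interface shape, the `PreDispatch` signature, the `ErrSaturated` error, and the `fixedDetector` stub are all hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Criticality is a hypothetical stand-in for the request criticality levels.
type Criticality int

const (
	Critical Criticality = iota
	Sheddable
)

// SaturationDetector is an assumed interface shape; the real one is added in #808.
type SaturationDetector interface {
	IsSaturated() bool
}

// ErrSaturated is a hypothetical sentinel error returned when a request is shed.
var ErrSaturated = errors.New("system saturated: dropping non-critical request")

// Request carries only the fields this sketch needs.
type Request struct {
	Model       string
	Criticality Criticality
}

// Director owns admission control ahead of scheduling.
type Director struct {
	detector SaturationDetector
}

// PreDispatch admits critical requests unconditionally and sheds
// non-critical requests while the detector reports saturation.
func (d *Director) PreDispatch(req Request) error {
	if req.Criticality == Critical {
		return nil // critical requests bypass the saturation check
	}
	if d.detector.IsSaturated() {
		return ErrSaturated
	}
	return nil
}

// fixedDetector is a test stub that always reports the same state.
type fixedDetector bool

func (f fixedDetector) IsSaturated() bool { return bool(f) }

func main() {
	// Wiring as main.go might do it: build the detector, hand it to the Director.
	d := &Director{detector: fixedDetector(true)}
	fmt.Println(d.PreDispatch(Request{Model: "m", Criticality: Critical}))
	fmt.Println(d.PreDispatch(Request{Model: "m", Criticality: Sheddable}))
}
```

Keeping the detector behind an interface is one way to let the Director be tested with a stub and to let a later component take over the check without changing callers.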

This refactoring leads to a cleaner architecture, making the Scheduler a more focused component and centralizing initial admission control logic, while paving the way for the future Flow Controller.
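
As a rough illustration of a Scheduler that applies one preference filter chain to every admitted request, consider the sketch below. The `Pod`, `Filter`, `Scheduler`, and `maxQueueDepth` names are hypothetical, and the rule that a filter is skipped when it would eliminate every candidate is an assumption made here to show "preference" semantics, not behavior confirmed by this PR.

```go
package main

import "fmt"

// Pod is a hypothetical candidate endpoint with a single load signal.
type Pod struct {
	Name       string
	QueueDepth int
}

// Filter narrows a candidate set; the signature is assumed for this sketch.
type Filter func(pods []Pod) []Pod

// maxQueueDepth keeps pods whose queue depth is at most limit.
func maxQueueDepth(limit int) Filter {
	return func(pods []Pod) []Pod {
		var kept []Pod
		for _, p := range pods {
			if p.QueueDepth <= limit {
				kept = append(kept, p)
			}
		}
		return kept
	}
}

// Scheduler applies the same filter chain to every request it receives.
// It makes no admission decision; that happened before scheduling.
type Scheduler struct {
	filters []Filter
}

// Schedule runs the chain and returns the first surviving candidate.
// A filter that would remove all candidates is treated as a preference
// and ignored (an assumption of this sketch).
func (s *Scheduler) Schedule(pods []Pod) (Pod, bool) {
	candidates := pods
	for _, f := range s.filters {
		if next := f(candidates); len(next) > 0 {
			candidates = next
		}
	}
	if len(candidates) == 0 {
		return Pod{}, false
	}
	return candidates[0], true
}

func main() {
	pods := []Pod{{"pod-a", 5}, {"pod-b", 1}, {"pod-c", 2}}
	s := &Scheduler{filters: []Filter{maxQueueDepth(2)}}
	if pod, ok := s.Schedule(pods); ok {
		fmt.Println("selected", pod.Name)
	}
}
```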

This is aligned with the direction in 0683-epp-architecture-proposal and should be nearly a no-op in terms of EPP behavior.

netlify bot commented May 8, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 4a7de3f
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/6822aa6ed948fd0008227048
😎 Deploy Preview https://deploy-preview-805--gateway-api-inference-extension.netlify.app

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LukeAVanDrie
Once this PR has been reviewed and has the lgtm label, please assign kfswain for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 8, 2025
@k8s-ci-robot k8s-ci-robot requested review from liu-cong and robscott May 8, 2025 20:26
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 8, 2025
@k8s-ci-robot
Contributor

Hi @LukeAVanDrie. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 8, 2025
@ahg-g
Contributor

ahg-g commented May 8, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 8, 2025
@LukeAVanDrie
Contributor Author

This change needs regression testing. It should be a no-op, but I will gather quantitative data to confirm that.

@LukeAVanDrie
Contributor Author

LukeAVanDrie commented May 8, 2025

I split out the addition of the saturation detector subdir into a separate PR (#808), to be submitted before this one. The detector remains unused until this PR, which wires it up, is submitted.

@LukeAVanDrie LukeAVanDrie force-pushed the saturation-detector branch from 112b943 to 48cc9a0 Compare May 8, 2025 20:51
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 8, 2025
@LukeAVanDrie LukeAVanDrie force-pushed the saturation-detector branch 3 times, most recently from a3d9090 to 9d273fa Compare May 9, 2025 02:49
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 9, 2025
@LukeAVanDrie LukeAVanDrie force-pushed the saturation-detector branch from 9d273fa to 83486ac Compare May 9, 2025 03:26
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 10, 2025
This commit refactors the request processing pipeline, externalizing
saturation detection and criticality-based service differentiation
from the Scheduler. These responsibilities are now primarily managed by
the RequestControl.Director.

This change is a preparatory step for the introduction of a new
Flow Controller component, which will eventually absorb these admission
control duties.

Key changes include:

- Introduced `PreDispatch` method to `RequestControl.Director`. It
  utilizes the `SaturationDetector` for admission control of
  non-critical requests and handles request criticality to determine if
  saturation checks are bypassed.
- The saturation detection logic for dropping non-critical requests
  is intentionally preserved within the `Director` at this stage.
  This allows the option to bypass the future Flow Controller
  component during its maturation, ensuring the existing saturation
  and sheddable request behavior can be maintained as a fallback.
- Simplified the `Scheduler` to focus solely on preference-based
  filtering and pod selection for requests that have already been
  admitted by the `Director`.
- Removed the `SheddableRequestFilter` and the distinct
  critical/sheddable filter paths from the `Scheduler`'s internal logic.
  The `Scheduler` now applies a single, unified preference filter chain
  to all incoming requests.
- Updated `main.go` to instantiate the `SaturationDetector`, wiring it
  into the request handling flow.
- Updated tests across `scheduler_test.go`, `director_test.go`, and
  `filter_test.go` to align with the new component responsibilities,
  adding additional coverage where necessary.

This refactoring leads to a cleaner architecture, making the `Scheduler`
a more focused component and centralizing initial admission control logic,
while paving the way for the future Flow Controller.

This is aligned with the direction in `0683-epp-architecture-proposal`
and should be nearly no-op in terms of EPP behavior.
@LukeAVanDrie LukeAVanDrie force-pushed the saturation-detector branch from 83486ac to 4a7de3f Compare May 13, 2025 02:11
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2025
@k8s-ci-robot
Contributor

@LukeAVanDrie: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-gateway-api-inference-extension-test-unit-main | 4a7de3f | link | true | `/test pull-gateway-api-inference-extension-test-unit-main` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 13, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
3 participants