@ali-a-a
Contributor

@ali-a-a ali-a-a commented Oct 25, 2025

Fixes #16185

Proposed Changes

In this test case, the synthetic delay applied when probing the pod is 250 ms and the probe timeout is 300 ms. That leaves a margin of only 50 ms, so the test becomes flaky when the test runner is under load. Decreasing (or removing) the delay deflakes the test.

To verify that this is the cause of the flakiness, I ran stress tests on TestProbePodIPs before and after the change:

250ms delay:

$ stress -p 256 ./net.test -test.run TestProbePodIPs
5s: 316 runs so far, 0 failures, 256 active

/tmp/go-stress-20251025T175448-1541118487
--- FAIL: TestProbePodIPs (0.30s)
    revision_backends_test.go:1862: TestProbePodIPs/one_pod_fails_probe: Healthy does not match, got map[10.10.1.3:{}], want map[10.10.1.2:{} 10.10.1.3:{}] diff:   sets.Set[string](
        - 	{"10.10.1.3": {}},
        + 	{"10.10.1.2": {}, "10.10.1.3": {}},
          )
FAIL


ERROR: exit status 1

100ms delay:

$ stress -p 256 ./net.test -test.run TestProbePodIPs
5s: 316 runs so far, 0 failures, 256 active
10s: 728 runs so far, 0 failures, 256 active
15s: 1227 runs so far, 0 failures, 256 active
20s: 1667 runs so far, 0 failures, 256 active
25s: 2136 runs so far, 0 failures, 256 active
30s: 2596 runs so far, 0 failures, 256 active
35s: 3052 runs so far, 0 failures, 256 active
40s: 3514 runs so far, 0 failures, 255 active
45s: 3969 runs so far, 0 failures, 256 active
50s: 4437 runs so far, 0 failures, 256 active
55s: 4892 runs so far, 0 failures, 256 active
1m0s: 5354 runs so far, 0 failures, 256 active
1m5s: 5821 runs so far, 0 failures, 256 active
1m10s: 6272 runs so far, 0 failures, 256 active
1m15s: 6733 runs so far, 0 failures, 256 active
1m20s: 7190 runs so far, 0 failures, 256 active

NOTE: I still don't know why this test case uses a delay. If it is not necessary, we can remove it entirely, since even a 100 ms delay can still cause flakiness.

Release Note

NONE

@knative-prow

knative-prow bot commented Oct 25, 2025

Welcome @ali-a-a! It looks like this is your first PR to knative/serving 🎉

@knative-prow knative-prow bot added the size/XS (Denotes a PR that changes 0-9 lines, ignoring generated files.) and needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.) labels Oct 25, 2025
@knative-prow

knative-prow bot commented Oct 25, 2025

Hi @ali-a-a. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dprotaso
Member

/ok-to-test

@knative-prow knative-prow bot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) and removed the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) Oct 27, 2025
@codecov

codecov bot commented Oct 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.06%. Comparing base (a2a2441) to head (9ec7891).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16201      +/-   ##
==========================================
+ Coverage   79.97%   80.06%   +0.09%     
==========================================
  Files         214      214              
  Lines       13281    13281              
==========================================
+ Hits        10621    10633      +12     
+ Misses       2299     2292       -7     
+ Partials      361      356       -5     


@dprotaso
Member

/retest
/lgtm
/approve

@knative-prow knative-prow bot added the lgtm label (Indicates that a PR is ready to be merged.) Oct 27, 2025
@knative-prow

knative-prow bot commented Oct 27, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ali-a-a, dprotaso

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Oct 27, 2025
@dprotaso
Member

/retest

@knative-prow knative-prow bot merged commit d5d624b into knative:main Oct 27, 2025
156 of 157 checks passed

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
size/XS: Denotes a PR that changes 0-9 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[flaky] pkg/activator/net: TestProbePodIPs

2 participants