-
Notifications
You must be signed in to change notification settings - Fork 10.2k
Fix the client connection is closing issue in robustness test #20689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix the client connection is closing issue in robustness test #20689
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: henrybear327 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
Codecov Report✅ All modified and coverable lines are covered by tests. Please upload reports for the commit c77e1b7 to get more accurate results. Additional details and impacted filessee 21 files with indirect coverage changes @@ Coverage Diff @@
## main #20689 +/- ##
==========================================
- Coverage 69.14% 69.11% -0.03%
==========================================
Files 420 420
Lines 34817 34794 -23
==========================================
- Hits 24074 24049 -25
- Misses 9338 9344 +6
+ Partials 1405 1401 -4 Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
/retest |
if err != nil { | ||
return err | ||
} | ||
defer c.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Close is called correctly. The ClientSet is implemented wrongly.
User stories:
- I want to ensure that client connections are cleaned up even if panic occurs.
defer c.Close()
should work. - I want reports generated by
c.Report()
to be complete. Client should be closed when generating report to prevent future changes. - Client.Close should not panic/error when I close it after getting Report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm ok.
Let me look into it. Thanks for the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the root cause is now clear.
The way we implement openWatchPeriodically
is that we will take the RecordingClient
as a parameter, and continuously spawn watches in the go routines.
In openWatchPeriodically
, when we break due to context is done or we receive something from the finish channel, we will exit the function. But the go routines that we have created are still executing, so there would be a possibility that the shared RecordingClient is closed outside of openWatchPeriodically
(due to defer
), but the Get
(or whatever calls) from within the goroutine is still being made (before the go routine reaches its exit checkpoint), thus, causing the grpc the client connection is closing
error, as we might be using closed RecordingClient!
The fix I have put up so far is not ... pretty though. Basically, locking the RecordingClient with the kvMux
for all actions. Maybe there is a better idea here :( @serathius (code here is PoC)
336739c
to
ee854c0
Compare
478d9e1
to
4bbfb83
Compare
4bbfb83
to
deda3df
Compare
deda3df
to
4c25b7d
Compare
c5b3be5
to
67c992b
Compare
@henrybear327: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
} | ||
watch := c.Watch(ctx, "", lastRevision+1, true, true, false) | ||
if watch == nil { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct. Please read how this loop exists.
return c.client.Endpoints() | ||
} | ||
|
||
func (c *RecordingClient) Watch(ctx context.Context, key string, rev int64, withPrefix bool, withProgressNotify bool, withPrevKV bool) clientv3.WatchChan { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please return error
c.kvMux.Lock() | ||
defer c.kvMux.Unlock() | ||
if c.isClosed { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong too.
c.kvMux.Lock() | ||
defer c.kvMux.Unlock() | ||
if c.isClosed { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't return nil as it breaks contract. Instead implement interface that returns error when we execute Txn.
Signed-off-by: Chun-Hung Tseng <[email protected]>
67c992b
to
c77e1b7
Compare
We should let the
ClientSet
handle the client closure.Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.