initialize command_queue on check_channel_request to avoid race
#34
Paramiko runs methods from `ServerInterface` in its own thread. `check_channel_request` is called when a new channel is requested, before the channel is actually created and returned by `transport.accept()`. After the request is granted by `check_channel_request`, the channel is created and returned by `transport.accept()`.

See https://docs.paramiko.org/en/stable/api/server.html#paramiko.server.ServerInterface.check_channel_request.
There is sometimes a race where `check_channel_exec_request` is invoked around the same time that `transport.accept()` returns a channel to `Handler.run()`.

Both race to create the command queue for the channel. When `Handler.run()` wins, the command added to the queue by `check_channel_exec_request` is lost into the void. The client receives no response and hangs indefinitely, freezing the test suite.

To fix this, we could either add a lock around both places where we create the command queue, or we could just create the command queue in `check_channel_request`, which always runs before `transport.accept()` returns.

I chose the latter, as it's simpler and avoids the need for locks.
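
To illustrate the idea, here is a minimal sketch that does not import paramiko. `FakeChannel`, `Server`, `command_queues` and `handler_run` are hypothetical stand-ins for this example, not the names used in the patch. Only the callback names and their argument order follow paramiko's `ServerInterface`. The queue is created once in `check_channel_request`, and both racing parties only look it up.

```python
import queue
import threading

OPEN_SUCCEEDED = 0  # stands in for paramiko.OPEN_SUCCEEDED


class FakeChannel:
    """Hypothetical stand-in for paramiko.Channel; only get_id() is modeled."""

    def __init__(self, chanid):
        self._chanid = chanid

    def get_id(self):
        return self._chanid


class Server:
    """Simplified model of the ServerInterface callbacks described above."""

    def __init__(self):
        # Hypothetical storage: one command queue per channel id.
        self.command_queues = {}

    def check_channel_request(self, kind, chanid):
        # Called before accept() can hand out the channel, so the queue
        # already exists when either racing party looks for it.
        self.command_queues[chanid] = queue.Queue()
        return OPEN_SUCCEEDED

    def check_channel_exec_request(self, channel, command):
        # Only looks up the queue; never creates or replaces it.
        self.command_queues[channel.get_id()].put(command)
        return True


def handler_run(server, channel, timeout=1.0):
    # Models Handler.run(): it also only looks up the existing queue.
    return server.command_queues[channel.get_id()].get(timeout=timeout)


server = Server()
server.check_channel_request("session", 7)
channel = FakeChannel(7)

# The handler may start before or after the exec request arrives;
# the command is delivered either way because the queue is shared.
results = []
worker = threading.Thread(
    target=lambda: results.append(handler_run(server, channel))
)
worker.start()
server.check_channel_exec_request(channel, b"ls")
worker.join()
```

Because neither `check_channel_exec_request` nor the handler creates the queue in this model, there is no window in which one side can overwrite a queue that the other side has already written to.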
This patch has been tested on fsspec/sshfs#66 (please ignore other changes, I am trying to fix another test at the same time), where I ran >200 test jobs.
We were seeing tests freeze intermittently, and when I investigated, I discovered this issue.
See https://github.com/fsspec/sshfs/actions/runs/18935359445/job/54060844842#step:5:22 for instance.