@skshetry skshetry commented Nov 1, 2025

Paramiko runs `ServerInterface` methods in its own thread. `check_channel_request` is called when a new channel is requested, before the channel is created. Once `check_channel_request` grants the request, the channel is created and returned by `transport.accept()`.

See
https://docs.paramiko.org/en/stable/api/server.html#paramiko.server.ServerInterface.check_channel_request.

There is an intermittent race in which `check_channel_exec_request` is invoked at around the same time that `transport.accept()` returns the channel to `Handler.run()`.

Both race to create the command queue for the channel. When `Handler.run()` wins, the command that `check_channel_exec_request` added to the queue is lost. The client receives no response and hangs indefinitely, freezing the test suite.
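To make the failure mode concrete, here is a minimal, stdlib-only sketch that replays one plausible interleaving step by step. It assumes a check-then-create pattern on the `Handler.run()` side and a `setdefault`-style creation on the exec-request side; both are assumptions for illustration, not the project's actual code, and `replay_lost_command` is a hypothetical name.

```python
import queue


def replay_lost_command():
    """Replay one interleaving of the two code paths in a single thread.

    Returns True if the command ends up in a queue nobody will read.
    """
    command_queues = {}  # hypothetical stand-in for the handler's per-channel queues
    chanid = 0

    # Step 1 (Handler.run() side): accept() returned a channel, and the
    # handler checks whether a queue exists for it yet. It does not.
    queue_missing = chanid not in command_queues

    # Step 2 (Paramiko thread): the exec request arrives, creates the
    # queue itself, and enqueues the command.
    command_queues.setdefault(chanid, queue.Queue()).put("ls")

    # Step 3 (Handler.run() side): acting on the stale check from step 1,
    # the handler installs a fresh queue, replacing the one holding "ls".
    if queue_missing:
        command_queues[chanid] = queue.Queue()

    # The queue the handler will read from is empty: the command is gone.
    return command_queues[chanid].empty()
```

In this interleaving the handler would block forever on an empty queue, which matches the hang described above.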

To fix this, we could either add a lock around both places where we create the command queue, or we could just create the command queue in `check_channel_request`, which always runs before `transport.accept()` returns.

I chose the latter, as it's simpler and avoids the need for locks.
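The chosen approach could look roughly like the sketch below. It is a simplified, stdlib-only stand-in rather than the actual patch: `SketchHandler` and `run_once` are hypothetical names, the exec callback takes a channel id instead of a Paramiko channel object, and the callbacks return `True` where real code would return Paramiko's own result values.

```python
import queue
import threading


class SketchHandler:
    """Hypothetical stand-in for the server handler; not Paramiko's API."""

    def __init__(self):
        self.command_queues = {}

    def check_channel_request(self, kind, chanid):
        # Runs before the channel is handed out by accept(), so the queue
        # has exactly one creator and exists before anyone else needs it.
        self.command_queues[chanid] = queue.Queue()
        return True

    def check_channel_exec_request(self, chanid, command):
        # Only enqueues; never creates the queue.
        self.command_queues[chanid].put(command)
        return True

    def handle_client(self, chanid, timeout=1.0):
        # Only reads; never creates the queue.
        return self.command_queues[chanid].get(timeout=timeout)


def run_once(exec_first):
    """Deliver one command with the exec request before or after the reader starts."""
    handler = SketchHandler()
    handler.check_channel_request("session", 0)
    received = []
    reader = threading.Thread(
        target=lambda: received.append(handler.handle_client(0))
    )
    if exec_first:
        handler.check_channel_exec_request(0, "ls")
        reader.start()
    else:
        reader.start()
        handler.check_channel_exec_request(0, "ls")
    reader.join()
    return received[0]
```

Because the queue is created once, before either of the racing code paths runs, the order in which the exec request and the reader arrive no longer matters, and no lock is needed.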


This patch has been tested on fsspec/sshfs#66 (please ignore the other changes there; I am trying to fix another test at the same time), where I ran more than 200 test jobs.
We were seeing tests freeze intermittently, and I discovered this issue while investigating.

See https://github.com/fsspec/sshfs/actions/runs/18935359445/job/54060844842#step:5:22 for instance.

@skshetry skshetry changed the title initialize command_queue on check_channel_request to avoid race initialize command_queue on check_channel_request to avoid race Nov 1, 2025
Comment on lines +17 to +18
include:
- os: ubuntu-22.04
@skshetry skshetry Nov 1, 2025

ubuntu-latest is now ubuntu-24.04, which does not support Python 3.7. See:
