
Proposal: provide error codes when closing connections and resetting streams #479

Open
@marten-seemann

### Related PRs
- [ ] #623 
- [ ] #622 

It would be really helpful to know why a peer closed a connection or reset a stream. Unfortunately, we currently don’t have access to that information.

Here’s a proposal for how to convey that information.

Connection Termination

Current situation:

  • QUIC: uses a CONNECTION_CLOSE frame, which carries a 62-bit error code and a human-readable message (a string limited by the MTU).
  • WebTransport: uses a CLOSE_WEBTRANSPORT_SESSION capsule, which carries a 32-bit error code and a human-readable message (up to 8k).
  • yamux: has a GOAWAY frame that abuses the 32-bit length field to carry an error code. Currently the spec only defines 3 distinct error codes.
  • mplex: don’t care

It seems straightforward to use a 32-bit error code space for libp2p, since that is the smallest space offered by the transports above. If we decide that transmitting an error message is important, we might be able to find a backwards-compatible yamux hack, similar to the one described in the “yamux hack” section below.
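To illustrate how the yamux GOAWAY frame already carries a 32-bit code, here is a minimal sketch of encoding and decoding the 12-byte yamux header (version, type, flags, stream ID, length). The three code values are the ones the yamux spec defines today; the helper names `encodeGoAway` and `decodeGoAway` are made up for this example and do not come from any existing implementation.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	yamuxVersion = 0   // only version defined by the yamux spec
	typeGoAway   = 0x3 // yamux frame type for GOAWAY
	headerSize   = 12  // version(1) + type(1) + flags(2) + stream ID(4) + length(4)
)

// Error codes currently defined by the yamux spec for GOAWAY.
const (
	goAwayNormal        uint32 = 0
	goAwayProtocolError uint32 = 1
	goAwayInternalError uint32 = 2
)

// encodeGoAway (hypothetical helper) builds a GOAWAY header. The error code
// is written into the 32-bit length field; GOAWAY frames carry no payload.
func encodeGoAway(code uint32) [headerSize]byte {
	var hdr [headerSize]byte
	hdr[0] = yamuxVersion
	hdr[1] = typeGoAway
	binary.BigEndian.PutUint16(hdr[2:4], 0)     // no flags
	binary.BigEndian.PutUint32(hdr[4:8], 0)     // stream ID 0 addresses the session
	binary.BigEndian.PutUint32(hdr[8:12], code) // length field doubles as error code
	return hdr
}

// decodeGoAway (hypothetical helper) extracts the error code from a GOAWAY header.
func decodeGoAway(hdr [headerSize]byte) (uint32, error) {
	if hdr[0] != yamuxVersion {
		return 0, errors.New("unsupported yamux version")
	}
	if hdr[1] != typeGoAway {
		return 0, errors.New("not a GOAWAY frame")
	}
	return binary.BigEndian.Uint32(hdr[8:12]), nil
}

func main() {
	hdr := encodeGoAway(goAwayProtocolError)
	code, err := decodeGoAway(hdr)
	fmt.Println(code, err)
}
```

Any new libp2p connection error codes would presumably travel in the same field, which is why a 32-bit space fits yamux without a wire format change.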

We could have different error codes for: connections that are closed because they were dial-raced with other connections, disallowed by a connection gater, closed due to resource limitations, closed to make room for more valuable connections, closed for different kinds of protocol violations, etc.

Caveat: With TCP linger set to 0, the TCP connection is reset instead of properly closed. This also means that the error code might not be transmitted reliably.

Stream Termination

Current situation:

  • QUIC: the RESET_STREAM frame contains a 62-bit error code field (there are no human-readable messages for stream resets)
  • WebTransport: limits stream reset error codes to 8 bits
  • yamux: doesn’t allow transmitting any error code
  • mplex: still don’t care

It seems like we’re therefore limited to 256 error codes, the smallest space among the transports that support codes at all. We’d need to reserve a subset of these for libp2p itself (for example, we need to convey that multistream negotiation failed, or that we didn’t even start multistream negotiation because of resource limits, etc.). The rest of the error codes would be defined by the application.
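One way to picture the split of the 8-bit space is sketched below. The boundary (`firstAppCode = 0x40`), the constant names, and the helper `AppStreamError` are all assumptions made for illustration; the proposal does not say how many codes libp2p would reserve.

```go
package main

import "fmt"

// StreamErrorCode is a hypothetical 8-bit stream reset code.
type StreamErrorCode uint8

// firstAppCode is an assumed boundary: codes below it are reserved for
// libp2p, codes at or above it belong to the application.
const firstAppCode StreamErrorCode = 0x40

// Placeholder libp2p-reserved codes for the cases named in the text.
const (
	StreamNoError               StreamErrorCode = 0
	StreamNegotiationFailed     StreamErrorCode = 1 // multistream negotiation failed
	StreamResourceLimitExceeded StreamErrorCode = 2 // negotiation never started due to resource limits
)

// IsReserved reports whether the code falls in the libp2p-reserved range.
func (c StreamErrorCode) IsReserved() bool { return c < firstAppCode }

// AppStreamError maps an application-defined code onto the wire code space,
// rejecting values that would not fit into the remaining 8-bit range.
func AppStreamError(appCode uint8) (StreamErrorCode, error) {
	if int(appCode) > 0xff-int(firstAppCode) {
		return 0, fmt.Errorf("application code %d does not fit: only %d codes available",
			appCode, 0x100-int(firstAppCode))
	}
	return firstAppCode + StreamErrorCode(appCode), nil
}

func main() {
	code, err := AppStreamError(7)
	fmt.Println(code, code.IsReserved(), err)
	fmt.Println(StreamNegotiationFailed.IsReserved())
}
```

With this layout an application sees its own codes starting at zero, while the wire value makes it unambiguous whether libp2p or the application reset the stream.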

yamux hack

Depending on how current implementations handle this, we could either:

  • if implementations ignore data sent on frames that have the RESET flag set: attach the error code to that frame
  • if implementations ignore stream data received on a stream that was reset (I think the Go implementation does): send the error code in a stream frame
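The first option could look roughly like the sketch below: a yamux Data frame with the RST flag set and a one-byte payload holding the error code. A decoder treats a zero-length reset as a legacy reset without a code. This is only an illustration of the idea, not a tested wire format; whether existing implementations tolerate a payload on a reset frame is exactly the open question, and `encodeReset` and `decodeReset` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerSize = 12  // version(1) + type(1) + flags(2) + stream ID(4) + length(4)
	typeData   = 0x0 // yamux Data frame
	flagRST    = 0x8 // yamux RST flag
)

// encodeReset (hypothetical) builds a Data frame with RST set whose single
// payload byte is the stream error code.
func encodeReset(streamID uint32, code uint8) []byte {
	frame := make([]byte, headerSize+1)
	frame[0] = 0 // yamux version
	frame[1] = typeData
	binary.BigEndian.PutUint16(frame[2:4], flagRST)
	binary.BigEndian.PutUint32(frame[4:8], streamID)
	binary.BigEndian.PutUint32(frame[8:12], 1) // payload length: one byte
	frame[headerSize] = code
	return frame
}

// decodeReset (hypothetical) parses a reset frame. hasCode is false for a
// legacy reset that carries no payload.
func decodeReset(frame []byte) (streamID uint32, code uint8, hasCode bool, err error) {
	if len(frame) < headerSize {
		return 0, 0, false, errors.New("short frame")
	}
	if frame[1] != typeData || binary.BigEndian.Uint16(frame[2:4])&flagRST == 0 {
		return 0, 0, false, errors.New("not a reset frame")
	}
	streamID = binary.BigEndian.Uint32(frame[4:8])
	length := binary.BigEndian.Uint32(frame[8:12])
	if length == 0 {
		return streamID, 0, false, nil // legacy reset without an error code
	}
	if uint32(len(frame)-headerSize) < length {
		return 0, 0, false, errors.New("truncated payload")
	}
	return streamID, frame[headerSize], true, nil
}

func main() {
	frame := encodeReset(5, 2)
	id, code, hasCode, err := decodeReset(frame)
	fmt.Println(id, code, hasCode, err)
}
```

The second option would instead send a plain RST frame and put the code into a separate stream frame, relying on older peers to discard data for a stream they already consider reset.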
