# Execution

The MsQuic API uses a different execution model compared to BSD-style sockets and most other networking libraries built on them. The sections below detail the designs MsQuic uses and the reasons behind these choices.

## Event Model

MsQuic Object Model describes the hierarchy of MsQuic objects.

The MsQuic API delivers all state changes and notifications for a specific MsQuic object directly to the corresponding callback handler registered by the application. These include connection state changes, new streams being created, stream data being received, and stream sends completing, among others.

Example definition of the Listener object callback API:

```c
typedef struct QUIC_LISTENER_EVENT {
    QUIC_LISTENER_EVENT_TYPE Type;
    union {
        struct { ... } NEW_CONNECTION;
        struct { ... } STOP_COMPLETE;
        ...
    };
} QUIC_LISTENER_EVENT;

typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
_Function_class_(QUIC_LISTENER_CALLBACK)
QUIC_STATUS
(QUIC_API QUIC_LISTENER_CALLBACK)(
    _In_ HQUIC Listener,
    _In_opt_ void* Context,
    _Inout_ QUIC_LISTENER_EVENT* Event
    );
```

The application must register a callback handler for every MsQuic object it creates. This handler must manage all the events MsQuic may indicate for that object, and must return a status for each event indicating how it was handled. The returned status is usually success/failure, but sometimes indicates to MsQuic that further processing is required.
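As a sketch of this pattern, a minimal listener handler might look like the following. `ServerConnectionCallback` and `Configuration` are hypothetical application-side symbols, not part of the MsQuic API; `MsQuic` is the API function table obtained from `MsQuicOpen2`:

```c
// Sketch of a listener event handler. Assumes the MsQuic API table
// (MsQuic), an application-defined ServerConnectionCallback, and a
// previously created Configuration handle are in scope.
_IRQL_requires_max_(PASSIVE_LEVEL)
_Function_class_(QUIC_LISTENER_CALLBACK)
QUIC_STATUS
QUIC_API
ServerListenerCallback(
    _In_ HQUIC Listener,
    _In_opt_ void* Context,
    _Inout_ QUIC_LISTENER_EVENT* Event
    )
{
    UNREFERENCED_PARAMETER(Listener);
    UNREFERENCED_PARAMETER(Context);
    switch (Event->Type) {
    case QUIC_LISTENER_EVENT_NEW_CONNECTION:
        // Register a handler for the new connection, then hand MsQuic
        // the settings/credentials to use for it.
        MsQuic->SetCallbackHandler(
            Event->NEW_CONNECTION.Connection,
            (void*)ServerConnectionCallback,
            NULL);
        return MsQuic->ConnectionSetConfiguration(
            Event->NEW_CONNECTION.Connection, Configuration);
    default:
        // Tell MsQuic this event was not handled.
        return QUIC_STATUS_NOT_SUPPORTED;
    }
}
```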

This approach differs significantly from sockets and most networking libraries, where the application must make a call (e.g., send or recv) to determine if something happened. This design choice was made for several reasons:

  • The MsQuic API runs in-process, eliminating the need for a kernel to user mode boundary switch to notify the application layer. This makes the callback-based design more practical compared to sockets.

  • The various events defined in MsQuic are derived from the underlying QUIC protocol. Applications may have hundreds of objects with potential state changes. The callback model allows the application to avoid synchronization/call management on each object and focus on event handling for the object.

  • Writing correct, scalable code in every application built on top of the socket interfaces is a repetitive, challenging task prone to errors. Offloading the threading and synchronization to MsQuic enables every application to be scalable with minimal effort, making things "just work" out of the box.

  • MsQuic's logic flow is simpler because it eliminates the queue/cached state of not-yet-delivered application notifications. The socket model must maintain this state to track yet-to-be-picked-up events/data, and the networking stack must wait for call(s) from the application before it can indicate completion. This represents additional code, complexity, and memory usage in the socket model that MsQuic does without.

## Writing Event Handlers

Event handlers are required for all objects that can receive events, as much of the MsQuic API operates through these callbacks. Critical events, such as "shutdown complete" notifications, provide vital information necessary for the application to function correctly. Without these events, the application cannot determine when it is safe to clean up objects.

Applications must keep the execution time within callbacks to a minimum. MsQuic does not use separate threads for protocol execution and upcalls to the application. Therefore, any significant delays in the callback will delay the protocol. Any substantial work required by the application must be performed on threads created by the application.
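One way to move heavy work off the callback thread is to return `QUIC_STATUS_PENDING` from a receive event and finish the receive later from an application thread via `StreamReceiveComplete`. The sketch below assumes this pattern; `QueueWorkToAppThread` is a hypothetical application helper:

```c
// Sketch: defer expensive receive processing to an application thread.
QUIC_STATUS
QUIC_API
StreamCallback(
    _In_ HQUIC Stream,
    _In_opt_ void* Context,
    _Inout_ QUIC_STREAM_EVENT* Event
    )
{
    UNREFERENCED_PARAMETER(Context);
    switch (Event->Type) {
    case QUIC_STREAM_EVENT_RECEIVE:
        // Hand the buffers to an app-owned worker (hypothetical helper)
        // and tell MsQuic the receive is still in progress. MsQuic will
        // not reuse the buffers until the receive is completed.
        QueueWorkToAppThread(
            Stream, Event->RECEIVE.Buffers, Event->RECEIVE.BufferCount);
        return QUIC_STATUS_PENDING;
    default:
        return QUIC_STATUS_SUCCESS;
    }
}

// Later, on the application thread, once the data has been consumed:
//   MsQuic->StreamReceiveComplete(Stream, BytesConsumed);
```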

This does not imply that the application needs separate threads to perform all of its work. Many operations are designed to be most efficient when executed within the callback. For example, closing a handle to a connection or stream is ideally done during the "shutdown complete" event notification callback.
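For instance, a connection handler can close its own handle directly inside the shutdown-complete event. A minimal sketch, with error handling and other events omitted:

```c
// Sketch of a connection handler that closes its handle once MsQuic
// signals that no more events will be delivered for this connection.
QUIC_STATUS
QUIC_API
ClientConnectionCallback(
    _In_ HQUIC Connection,
    _In_opt_ void* Context,
    _Inout_ QUIC_CONNECTION_EVENT* Event
    )
{
    UNREFERENCED_PARAMETER(Context);
    switch (Event->Type) {
    case QUIC_CONNECTION_EVENT_SHUTDOWN_COMPLETE:
        // Final event for this connection; closing the handle inline
        // here avoids bouncing to another thread just to clean up.
        MsQuic->ConnectionClose(Connection);
        break;
    default:
        break;
    }
    return QUIC_STATUS_SUCCESS;
}
```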

Some events require the application to call back into the MsQuic API in response. Such cyclic call patterns could lead to deadlocks in a generic implementation, but not in MsQuic. Special attention has been paid to ensure that MsQuic API (down) calls made from a callback thread always execute inline (thus avoiding deadlocks) and take precedence over any calls in progress or queued from separate threads. By default, MsQuic will never invoke a recursive callback to the application in these cases. The only exception to this rule is if the application opts in via the QUIC_STREAM_SHUTDOWN_FLAG_INLINE flag when calling StreamShutdown from a callback.

## Threading

MsQuic creates its own threads by default to manage the execution of its logic. The number and configuration of these threads depend on the settings passed to RegistrationOpen or QUIC_PARAM_GLOBAL_EXECUTION_CONFIG.
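For example, an execution profile hint can be supplied at registration time to influence how these worker threads behave. A sketch, where `"myapp"` is a placeholder application name:

```c
// Sketch: choose an execution profile when opening a registration.
// QUIC_EXECUTION_PROFILE_LOW_LATENCY is the default; other profiles
// (e.g., QUIC_EXECUTION_PROFILE_TYPE_MAX_THROUGHPUT) trade latency
// for throughput.
const QUIC_REGISTRATION_CONFIG RegConfig = {
    "myapp",                            // app name (placeholder)
    QUIC_EXECUTION_PROFILE_LOW_LATENCY  // hint for worker thread behavior
};
HQUIC Registration = NULL;
QUIC_STATUS Status = MsQuic->RegistrationOpen(&RegConfig, &Registration);
if (QUIC_FAILED(Status)) {
    // handle the error
}
```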

MsQuic typically creates a dedicated worker thread for each processor. Each thread is hard-affinitized to a specific NUMA node and soft-affinitized (set as 'ideal processor') to a specific processor. By default, each of these threads handles both the datapath (i.e., UDP) and QUIC layers, though MsQuic may be configured to run these layers on separate threads. Using a single worker thread for both layers helps MsQuic achieve lower latency, while using separate threads for the two layers can help achieve higher throughput. MsQuic aligns its processing logic with the rest of the networking stack (including hardware RSS) to ensure that all processing stays on the same NUMA node, and ideally, the same processor.

The complexity of aligning processing across various threads and processors is the primary reason for MsQuic to manage its own threading. This provides developers with a performant abstraction of both functionality and threading model, which simplifies application development using MsQuic, ensuring that things "just work" efficiently for QUIC by default.

Each thread manages the execution of one or more connections. Connections are distributed across threads based on their RSS alignment, which should evenly distribute traffic based on different UDP tuples. Each connection and its derived state (i.e., streams) are managed and executed by a single thread at a time, but may move across threads to align with any RSS changes. This ensures that each connection and its streams are effectively single-threaded, including all upcalls to the application layer. MsQuic will never make upcalls for a single connection or any of its streams in parallel.

For listeners, the application callback will be called in parallel for new connections, allowing server applications to scale efficiently with the number of processors.

```mermaid
graph TD
    subgraph Kernel
        NIC-Queue1[NIC Queue]
        NIC-Queue2[NIC Queue]
        NIC-Queue1 -->|RSS Receive| UDP1[IP/UDP]
        NIC-Queue2 -->|RSS Receive| UDP2[IP/UDP]
    end
    subgraph MsQuic Process
        UDP1 -.-> Processor1
        UDP2 -.-> Processor2
        subgraph Processor1[Processor 0]
            Thread1[Thread]
            Thread1 -->|Manages| Connection1[Connection 1]
            Thread1 -->|Manages| Connection2[Connection 2]
            Connection1 -->|Delivers Event| ApplicationCallback1[App Callback]
            Connection2 -->|Delivers Event| ApplicationCallback2[App Callback]
        end
        subgraph Processor2[Processor 1]
            Thread2[Thread]
            Thread2 -->|Manages| Connection3[Connection 3]
            Connection3 -->|Delivers Event| ApplicationCallback3[App Callback]
        end
    end
```

## See Also

QUIC_STREAM_CALLBACK
QUIC_STREAM_EVENT
QUIC_CONNECTION_CALLBACK
QUIC_CONNECTION_EVENT
QUIC_LISTENER_CALLBACK
QUIC_LISTENER_EVENT