TQ: Implement prepare and commit for initial config #8682
```rust
//
// Nexus should only attempt to commit nodes that have acknowledged
// a `Prepare`. The most likely reason that this has occurred
// is that the node has lost its state on the M.2 drives. It can
```
I realized that recovery is not actually guaranteed here, as the drive could have been wiped after acking the latest configuration but before the keys were rotated. A new configuration could then have been issued that doesn't contain the encrypted rack secret for the unrotated keys' epoch. I think this is a rare scenario, but I also think we probably don't need to handle byzantine failure here. Instead, this should probably be an alarm state (similar to what was done in #8062), with a support call if the data on the M.2s is gone.
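For illustration, a minimal sketch of what such an alarm state could look like; `Alarm`, `NodeState`, and `raise_alarm` are hypothetical names for this sketch, not the actual omicron types:

```rust
/// Hypothetical sketch: reasons a node can enter a stuck,
/// support-required state. Not the actual omicron types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Alarm {
    /// A `Commit` arrived for an epoch this node never acked a
    /// `Prepare` for, and the on-disk state needed to recover is gone.
    CommitWithoutPrepare { epoch: u64 },
}

#[derive(Debug)]
pub enum NodeState {
    Running,
    /// Terminal: the node stops participating in the protocol until
    /// remedied via support.
    Alarmed(Alarm),
}

impl NodeState {
    /// Record an alarm rather than panicking: an errant message must
    /// not take down the sled-agent.
    pub fn raise_alarm(&mut self, alarm: Alarm) {
        *self = NodeState::Alarmed(alarm);
    }
}
```

The key design point is that the alarm is sticky: once entered, the node refuses further protocol progress rather than guessing at recovery.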
Builds upon #8682. This PR implements the ability to reconfigure the trust quorum after a commit. This includes the ability to fetch shares for the most recently committed configuration to recompute the rack secret, and then include that in encrypted form in the new configuration for key rotation purposes.

The cluster proptest was enhanced to allow this, and it generates enough races (even without crashing and restarting nodes) that it forced the handling of `CommitAdvance` messages to be implemented. This implementation includes the ability to construct key shares for a new configuration when a node misses a prepare and commit for that configuration. This required adding a `KeyShareComputer`, which collects key shares for the configuration returned in a `CommitAdvance` so that the node can construct its own key share and commit the newly learned configuration. Importantly, constructing a key share and coordinating a reconfiguration are mutually exclusive, and so a new invariant was added to the cluster test.

We also start keeping track of expunged nodes in the cluster test, although we don't yet inform them that they are expunged if they reach out to other nodes.

There are a few places in the code where a runtime invariant is violated and an error message is logged. This always occurs on message receipt, and we don't want to panic at runtime because of an errant message and take down the sled-agent. However, we'd like to be able to report these situations upstream. The first step is to detect when they are hit and put the node in an `Alarm` state, such that it is stuck until remedied via support. We should *never* see an `Alarm` in practice, but since the states are reachable, we should manage them appropriately. This will come in a follow-up PR and be similar to what I implemented in #8062.
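To make the `KeyShareComputer` idea concrete, here is a minimal sketch under assumed names and types; `NodeId`, `Share`, `on_share`, and the elided interpolation are all illustrative, not the actual omicron code:

```rust
use std::collections::BTreeMap;

// Illustrative aliases; the real types live in the trust quorum crate.
pub type NodeId = String;
pub type Share = Vec<u8>;

/// Hypothetical sketch of a `KeyShareComputer`: it collects key shares
/// for the configuration learned via `CommitAdvance` until it can
/// construct this node's own share.
pub struct KeyShareComputer {
    /// Epoch of the configuration learned from `CommitAdvance`.
    epoch: u64,
    /// How many distinct shares are needed to reconstruct.
    threshold: usize,
    collected: BTreeMap<NodeId, Share>,
}

impl KeyShareComputer {
    pub fn new(epoch: u64, threshold: usize) -> Self {
        Self { epoch, threshold, collected: BTreeMap::new() }
    }

    /// Record a share from a peer. Once `threshold` shares are held,
    /// compute this node's own share so the newly learned
    /// configuration can be committed locally.
    pub fn on_share(
        &mut self,
        from: NodeId,
        epoch: u64,
        share: Share,
    ) -> Option<Share> {
        // Ignore shares for any other configuration.
        if epoch != self.epoch {
            return None;
        }
        self.collected.insert(from, share);
        if self.collected.len() >= self.threshold {
            Some(self.compute_own_share())
        } else {
            None
        }
    }

    fn compute_own_share(&self) -> Share {
        // Placeholder: the real computation uses the secret-sharing
        // scheme's math (e.g. interpolation over the collected shares).
        Vec::new()
    }
}
```

Since constructing a key share and coordinating a reconfiguration are mutually exclusive, a node would hold at most one of a coordinator state or a `KeyShareComputer` at a time, which is exactly the new invariant the cluster test checks.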
Initial configurations can be prepared and committed with the implemented handlers. This is tested, along with Nexus-driven aborts for the case where the coordinator of the initial configuration has crashed, in a new property-based test.
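As a rough sketch of the prepare/commit shape described here (all names are hypothetical; the real handlers persist state and carry much more context):

```rust
/// Hypothetical sketch of the prepare/commit flow for an initial
/// configuration; not the actual omicron API.
#[derive(Debug, Clone)]
pub struct Configuration {
    pub epoch: u64,
    // Membership, encrypted rack secret, etc. elided.
}

#[derive(Default)]
pub struct Node {
    prepared: Option<Configuration>,
    committed: Option<Configuration>,
}

pub enum Response {
    PrepareAck { epoch: u64 },
    CommitAck { epoch: u64 },
    /// Nexus should only commit at nodes that acked a `Prepare`.
    NotPrepared { epoch: u64 },
}

impl Node {
    /// Record the configuration, then ack so Nexus can commit.
    pub fn handle_prepare(&mut self, config: Configuration) -> Response {
        let epoch = config.epoch;
        self.prepared = Some(config);
        Response::PrepareAck { epoch }
    }

    /// Only commit what was prepared; otherwise report, don't panic.
    pub fn handle_commit(&mut self, epoch: u64) -> Response {
        match &self.prepared {
            Some(c) if c.epoch == epoch => {
                self.committed = self.prepared.clone();
                Response::CommitAck { epoch }
            }
            _ => Response::NotPrepared { epoch },
        }
    }
}
```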
The new property-based test runs all possible nodes in the universe as the system under test (SUT), rather than running only the coordinator. This allows a full deterministic simulation of the protocol and checking of invariants at all nodes. It's also easier to write and understand, as we don't have to capture and mock replies to the coordinator. I had always intended to write this test, but started with modelling the coordinator first since I thought it would be easier to incrementally build the protocol that way. However, it appears just as easy to incrementally build with all nodes as the SUT.
The new test does not have a model of the system, which is exceedingly hard to build for such a protocol. Instead, the test checks invariants of the real state of the SUT after every action, and allows peppering in postconditions as necessary for each action or operation.
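The shape of such a test, sketched with the `proptest` crate; `Action`, `Cluster`, and the single invariant shown are stand-ins for the real generators and checks:

```rust
use proptest::prelude::*;
use proptest::test_runner::TestCaseError;

// Stand-in for the generated operations driven against all nodes.
#[derive(Debug, Clone)]
enum Action {
    DeliverMsg { from: usize, to: usize },
    CrashNode(usize),
    NexusPoll,
}

fn action_strategy() -> impl Strategy<Value = Action> {
    prop_oneof![
        (0..4usize, 0..4usize)
            .prop_map(|(from, to)| Action::DeliverMsg { from, to }),
        (0..4usize).prop_map(Action::CrashNode),
        Just(Action::NexusPoll),
    ]
}

// Stand-in for the real SUT: every node in the universe.
struct Cluster {
    committed_epochs: Vec<Option<u64>>,
}

impl Cluster {
    fn new(n: usize) -> Self {
        Self { committed_epochs: vec![None; n] }
    }

    fn apply(&mut self, _action: Action) {
        // The real test routes messages between nodes, crashes and
        // restarts them, and simulates Nexus here.
    }

    /// Example invariant: every node that has committed agrees on
    /// the committed epoch.
    fn check_invariants(&self) -> Result<(), TestCaseError> {
        let mut epochs = self.committed_epochs.iter().flatten();
        if let Some(first) = epochs.next() {
            for e in epochs {
                prop_assert_eq!(e, first);
            }
        }
        Ok(())
    }
}

proptest! {
    #[test]
    fn cluster_invariants_hold(
        actions in proptest::collection::vec(action_strategy(), 1..200)
    ) {
        let mut cluster = Cluster::new(4);
        for action in actions {
            cluster.apply(action);
            // No model of the system: assert over the real state of
            // every node after each action.
            cluster.check_invariants()?;
        }
    }
}
```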
The Node API has also changed to not concern itself with time at all; instead it deals in terms of connections and disconnections. This makes for simpler code IMO, and matches what was done for LRTQ. We are always operating over sprockets streams, which run over TLS over TCP, so it makes little sense to model things as if arbitrary packets can get dropped and reordered.
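A minimal sketch of what a connection-oriented, time-free API could look like; all names here are assumptions, not the actual Node API:

```rust
use std::collections::{BTreeMap, BTreeSet};

// Illustrative stand-ins for the real peer and message types.
pub type PeerId = String;

#[derive(Debug, Clone)]
pub enum Msg {
    Prepare { epoch: u64 },
    PrepareAck { epoch: u64 },
}

#[derive(Default)]
pub struct Node {
    connected: BTreeSet<PeerId>,
    /// Messages waiting for a peer to (re)connect. A sprockets stream
    /// is ordered and reliable for the life of a connection, so only
    /// whole connections fail -- never individual packets.
    pending: BTreeMap<PeerId, Vec<Msg>>,
}

impl Node {
    /// A session to `peer` was established: flush anything queued.
    pub fn on_connect(
        &mut self,
        peer: PeerId,
        outbox: &mut Vec<(PeerId, Msg)>,
    ) {
        self.connected.insert(peer.clone());
        for msg in self.pending.remove(&peer).unwrap_or_default() {
            outbox.push((peer.clone(), msg));
        }
    }

    /// The session dropped: nothing to time out, just stop sending.
    pub fn on_disconnect(&mut self, peer: &PeerId) {
        self.connected.remove(peer);
    }

    /// Send now if connected, otherwise queue for the next connect.
    pub fn send(
        &mut self,
        peer: PeerId,
        msg: Msg,
        outbox: &mut Vec<(PeerId, Msg)>,
    ) {
        if self.connected.contains(&peer) {
            outbox.push((peer, msg));
        } else {
            self.pending.entry(peer).or_default().push(msg);
        }
    }
}
```

No timers or clocks appear anywhere in the API: retransmission collapses into "resend on the next `on_connect`", which is what makes the deterministic simulation in the proptest straightforward.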
As a result of the new proptest and the change in time usage, I've decided to drop the coordinator test altogether. It's too complicated for the value it adds, and urgency is a priority.