Optimize the `OptimizeCliffordT` transpiler pass. #14996

alexanderivrii · 2025-09-08T08:37:28Z

Summary

In #14433 we added an extremely naive Clifford+T/Tdg optimization pass that aims to reduce the total number of T/Tdg-gates in a Clifford+T/Tdg circuit by combining consecutive pairs of T-gates into S-gates and consecutive pairs of Tdg-gates into Sdg-gates. This PR completely replaces it with a much better algorithm, which is furthermore implemented in Rust. We also believe the algorithm is exact (that is, it produces the minimum number of T/Tdg-gates).

The idea comes from discussions with Shelly, Julien, Ali and Simon. In essence, we apply the Litinski transform to 1-qubit sequences of Clifford+T/Tdg gates. We iteratively process the gates in the sequence. At each point we have a running list R of $\pm \pi/8$ rotations and a trailing Clifford operator C. When we encounter a new Clifford gate, we simply merge it into C. When we encounter a new T or Tdg-gate, we convert it into an RZ-rotation (while keeping track of the global phase) and swap this rotation with C (this does not change C but may change the axis of the rotation). We append this rotation to R and then check if it can be combined with the previous rotation in R. As an example, if the last rotation in R is an $RX(\pi/8)$-rotation and we are appending an $RX(-\pi/8)$-rotation, then the two rotations simply cancel out. On the other hand, if the last rotation in R is an $RX(\pi/8)$-rotation and we are appending another $RX(\pi/8)$-rotation, then the two rotations can be combined into a Clifford gate and then merged into C. After we process every gate, we rewrite the rotations in R using Clifford+T/Tdg gates and express C in terms of Clifford gates.
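The loop described above can be sketched in a few lines of pure Python. This is only an illustration under simplifying assumptions, not the Rust implementation from this PR: it ignores the global phase, it tracks C through its conjugation action on the Paulis instead of through precomputed tables, and it returns the remaining rotations rather than resynthesizing a circuit. The names `reduce_t_count`, `append_gate` and `absorb_rotation` are hypothetical.

```python
# frame[P] = (sign, Q) means C^dagger P C = sign * Q for the trailing Clifford C.
# Action g^dagger P g of the two generators H and S on the Paulis.
GATE_ACTION = {
    "h": {"X": (1, "Z"), "Y": (-1, "Y"), "Z": (1, "X")},
    "s": {"X": (-1, "Y"), "Y": (1, "X"), "Z": (1, "Z")},
}
# A few other Cliffords written as H/S sequences (equal up to global phase).
EXPAND = {"h": ["h"], "s": ["s"], "z": ["s", "s"],
          "sdg": ["s", "s", "s"], "x": ["h", "s", "s", "h"]}
# Products of distinct Paulis in cyclic order: XY ~ Z, YZ ~ X, ZX ~ Y.
CYCLIC = {("X", "Y"): "Z", ("Y", "Z"): "X", ("Z", "X"): "Y"}


def append_gate(frame, gate):
    """Frame of g*C (gate g applied after C), given the frame of C."""
    out = {}
    for p, (s1, q1) in GATE_ACTION[gate].items():
        s2, q2 = frame[q1]
        out[p] = (s1 * s2, q2)
    return out


def absorb_rotation(frame, axis, sign):
    """Frame of C*V, where V is the Clifford rotation exp(-i*sign*pi/4*axis)."""
    out = {}
    for p, (s, q) in frame.items():
        if q == axis:
            out[p] = (s, q)  # V commutes with its own axis
        elif (q, axis) in CYCLIC:
            out[p] = (s * sign, CYCLIC[(q, axis)])
        else:
            out[p] = (-s * sign, CYCLIC[(axis, q)])
    return out


def reduce_t_count(gates):
    """Return the list R of remaining (axis, sign) rotations."""
    frame = {p: (1, p) for p in "XYZ"}  # C starts as the identity
    rotations = []
    for name in gates:
        if name in ("t", "tdg"):
            # Swapping the Z-rotation with C changes its axis, not C.
            sign, axis = frame["Z"]
            if name == "tdg":
                sign = -sign
            if rotations and rotations[-1][0] == axis:
                _, prev_sign = rotations.pop()
                if prev_sign == sign:
                    # Two equal rotations form a Clifford: merge it into C.
                    frame = absorb_rotation(frame, axis, sign)
                # Opposite signs: the two rotations simply cancel.
            else:
                rotations.append((axis, sign))
        else:
            for g in EXPAND[name]:
                frame = append_gate(frame, g)
    return rotations


print(reduce_t_count(["t", "s", "t"]))  # [] since T S T is the Clifford Z
print(reduce_t_count(["t", "h", "t"]))  # [('Z', 1), ('X', 1)]
```

The length of the returned list is the T-count of the reduced sequence. Because each T/Tdg gate triggers at most one comparison with the last entry of R, the sketch runs in time linear in the number of gates.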

Details and comments

In the above algorithm we need to reason about operators that can be constructed using 1q-Clifford gates. Unlike the other Clifford classes, we need to keep track of the global phase. This leads to $192 = 24 \times 8$ possible operators, corresponding to $24$ single-qubit Cliffords multiplied by a factor of $e^{\pi k i/4}$, $k=0,\dots,7$. To reason about what happens when a Clifford gate is appended or prepended to such a Clifford operator, we have precomputed the tables for appending/prepending H and S-gates. Similarly, we have precomputed tables for evolving RX, RY, RZ rotations using such a Clifford operator. This leads to a fast but somewhat ugly implementation. @ShellyGarion is investigating if we can replace these precomputed tables by an explicit construction.
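As a sanity check on the $192 = 24 \times 8$ count, one can enumerate the group generated by the H and S matrices by brute force. The sketch below uses floating-point $2 \times 2$ matrices with rounding to identify equal operators; it is unrelated to how the tables in this PR were produced, and the helper names (`generate_group`, `phase_class`) are made up for the illustration.

```python
import cmath
import math

OMEGA = cmath.exp(1j * math.pi / 4)  # the phase e^{i*pi/4}
R2 = 1 / math.sqrt(2)
H = ((R2 + 0j, R2 + 0j), (R2 + 0j, -R2 + 0j))
S = ((1 + 0j, 0j), (0j, 1j))


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def key(m):
    """Hashable fingerprint of a matrix, robust to float round-off."""
    return tuple((round(z.real, 6), round(z.imag, 6)) for row in m for z in row)


def generate_group(generators):
    """Breadth-first closure of the generators under multiplication."""
    identity = ((1 + 0j, 0j), (0j, 1 + 0j))
    seen = {key(identity): identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for m in frontier:
            for g in generators:
                product = matmul(g, m)
                k = key(product)
                if k not in seen:
                    seen[k] = product
                    next_frontier.append(product)
        frontier = next_frontier
    return list(seen.values())


def phase_class(m):
    """Canonical fingerprint of m up to a factor e^{i*pi*k/4}."""
    return min(
        key(tuple(tuple(OMEGA**k * z for z in row) for row in m))
        for k in range(8)
    )


ops = generate_group([H, S])
print(len(ops))                           # 192
print(len({phase_class(m) for m in ops}))  # 24
```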

@ajavadia has Python code that also reimplements the exact resynthesis of Clifford+T/Tdg circuits. In fact, Ali's code has both 1-qubit and multiple-qubit versions, but here I am looking at the 1-qubit one. On the following example,

from qiskit import QuantumCircuit
from qiskit.quantum_info import random_clifford

circuit = QuantumCircuit(1)
for i in range(10000):
    circuit.t(0)
    circuit.compose(random_clifford(1, seed=i*23+17).to_circuit(), [0], inplace=True)

both Ali's code and the code in this PR reduce the number of $T$-gates from 10000 to 3288. However, the implementation in this PR is about 800x faster, taking 0.0063 seconds compared to 5.2329 seconds.


qiskit-bot · 2025-09-08T08:37:34Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core


coveralls · 2025-09-08T10:07:47Z

Pull Request Test Coverage Report for Build 17572998592

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

216 of 245 (88.16%) changed or added relevant lines in 4 files are covered.
18 unchanged lines in 5 files lost coverage.
Overall coverage increased (+0.002%) to 88.376%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| crates/transpiler/src/passes/optimize_clifford_t.rs | 211 | 240 | 87.92% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| crates/circuit/src/parameter/parameter_expression.rs | 1 | 82.79% |
| crates/circuit/src/parameter/symbol_expr.rs | 1 | 73.15% |
| qiskit/transpiler/passes/layout/vf2_utils.py | 1 | 93.71% |
| crates/qasm2/src/lex.rs | 3 | 91.75% |
| qiskit/transpiler/passes/layout/vf2_post_layout.py | 12 | 91.12% |

| Totals | |
|---|---|
| Change from base Build 17500557721: | 0.002% |
| Covered Lines: | 92406 |
| Relevant Lines: | 104560 |

💛 - Coveralls

mtreinish

I haven't reviewed this in depth yet, but from a quick skim one thing stuck out to me about all the constant arrays, and I left an inline comment on it. It's a small thing, but it could potentially impact performance, so I wanted to mention it before giving a full review.

mtreinish · 2025-09-08T13:23:22Z

crates/transpiler/src/passes/optimize_clifford_t.rs

+// Precomputed tables used in the algorithm.
+
+// Index of the Clifford1q operator -> corresponding Clifford circuit
+const CIRCUIT: &[&[StandardGate]; 24] = &[


Typically I'd expect all of these arrays to be:

Suggested change

-const CIRCUIT: &[&[StandardGate]; 24] = &[
+static CIRCUIT: [[StandardGate]; 24] = [

instead of const slices. In practice, the difference is that a const is basically a compiler directive that inlines the value everywhere it's used, while a static lives at a single address in memory that is loaded with the binary. For arrays like this it's normally more efficient to use a static, because with a const, if you're using it multiple times or in a loop, it's basically like doing:

let mut idx = 0;
let mut val = 0;
loop {
    // With a const, the array is conceptually re-created at every use.
    let foo = [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z];
    val += foo[idx];
    idx += 1;
    if idx >= foo.len() {
        break;
    }
}

every time you access these. If you're only using it once then it probably doesn't matter though. It's worth benchmarking to be sure, of course, but this is why arrays like this are typically defined as statics, not const slices, in other places.

alexanderivrii · Sep 8, 2025

Thanks! I did not understand this difference before.

alexanderivrii · Sep 9, 2025

I have changed const to static in 673580f. In particular, the CIRCUIT array is now of the form

static CIRCUIT: [&[StandardGate]; 24] = [&[], &[StandardGate::H], ...]

I was not able to get rid of the inner slices though, as different entries consist of different numbers of gates.

Oh, on the circuit from this PR's summary this had absolutely no effect on performance.

ShellyGarion · 2025-09-09T14:52:53Z

crates/transpiler/src/passes/optimize_clifford_t.rs

+    &[StandardGate::S],
+    &[StandardGate::H, StandardGate::S],
+    &[StandardGate::S, StandardGate::H],
+    &[StandardGate::S, StandardGate::H, StandardGate::S],


A minor comment:
I think that for the 6 representatives of the Cliffords, it might be better to choose:
[I, H, S, HS, SdgH, SHS]
since they give exactly the following Cliffords:

Clifford: Stabilizer = ['+Z'], Destabilizer = ['+X']
Clifford: Stabilizer = ['+X'], Destabilizer = ['+Z']
Clifford: Stabilizer = ['+Z'], Destabilizer = ['+Y']
Clifford: Stabilizer = ['+Y'], Destabilizer = ['+Z']
Clifford: Stabilizer = ['+X'], Destabilizer = ['+Y']
Clifford: Stabilizer = ['+Y'], Destabilizer = ['+X']

You chose SH instead of SdgH, which gives "-" instead of "+":
Clifford: Stabilizer = ['+X'], Destabilizer = ['-Y']

The tables would need to be updated accordingly, but they should then be more symmetric.

ajavadia · 2025-09-10T06:42:19Z

This looks good! I was initially a bit confused about what your code does but now I understand it so I will comment my understanding here. Feel free to use any part in the docstrings.

This pass can be run as a peephole optimization pass on a circuit written over the "Clifford+T" gateset. More precisely it collapses all chains of 1-qubit gates containing Clifford+T/Tdg into a minimal usage of T/Tdg. For a chain containing m gates, the runtime is O(m).

Linear-time complexity comes from the fact that in the special case of one-qubit gates, there will be no commutation opportunities between two rotations unless there is also a merge/cancel opportunity. This is no longer true for multi-qubit rotations because we must potentially consider repeated commutations until we find a rotation to merge/cancel with (see Zhang's algorithm arXiv:1903.12456).

Optimality comes from the fact that again in the special case of 1-qubit gates if we have a gate sequence such that no two consecutive rotations commute, then the sequence is optimal. This is because a chain of m rotations of this form necessarily has smallest denominator exponent (sde) in its channel representation equal to m, and we know sde is equal to the optimal T count, see arXiv:1308.4134. Since our algorithm ensures that no consecutively commuting rotations remain in the circuit (by merging/cancelling), the circuit it produces is optimal.

alexanderivrii added 8 commits September 7, 2025 14:35

Porting the CliffordT optimization pass to Rust.

6047fb5

At this point the pass is still very naive and only cancels pairs of adjacent T-gates and pairs of adjacent Tdg-gates. A change in behavior: the pass raises an error if the circuit has non-(Clifford+T) gates.

moving optimization code to a seperate functions

f2f8b91

Reimplementing Clifford+T optimization algorithm.

ee6c3fd

The optimization applies to sequences of 1-qubit Clifford+T/Tdg gates. We believe that for 1-qubit circuits we get optimal T-counts.

using StandardGate::X instead of X, as not to confuse with similarly-…

a9eec5f

…named Paulis

Improved comments and adding Shelly as a co-author.

fdba4f9

Co-authored-by: Shelly Garion <[email protected]>"

removing unused imports

0a6da8e

missing i in the comment

2244cc8

and removing 2

c89c0d1

alexanderivrii added this to the 2.3.0 milestone Sep 8, 2025

alexanderivrii requested a review from a team as a code owner September 8, 2025 08:37

alexanderivrii added this to Transpiler Sep 8, 2025

alexanderivrii added the Changelog: New Feature Include in the "Added" section of the changelog label Sep 8, 2025

github-project-automation bot moved this to To do in Transpiler Sep 8, 2025

This time properly adding Shelly as co-author

2e5ab0e

Co-authored-by: Shelly Garion <[email protected]>

ShellyGarion added the fault tolerance related to fault tolerance compilation label Sep 8, 2025

lint + avoid unnecessary collection

a59530f

ShellyGarion added the mod: transpiler Issues and PRs related to Transpiler label Sep 8, 2025

mtreinish reviewed Sep 8, 2025


alexanderivrii added 2 commits September 9, 2025 08:25

changing const to static

673580f

changing usize to u8

c673528

ShellyGarion reviewed Sep 9, 2025
