rfc: collect and provide structured benchmark programs #32
Conversation
burgholzer
left a comment
Hey @DRovara 👋🏼
Thanks for kicking this off! Great to see the first RFC ;-)
I went through the document top-to-bottom once and accumulated some comments. Most of them are pretty minor and should be fairly easy to address.
I have one more general comment or request for changes:
Would it make sense to define a list of "defining features" for a structured program and then create a table for the benchmarks that highlights which features a particular program uses?
Something like:
- Loops with compile-time bounds
- Loops with runtime bounds (true `while` loops)
- Dynamic qubit indexing (within loops)
- Conditional quantum instructions (e.g., depending on mid-circuit measurement results)
- Qubit reuse (e.g., via `reset` instructions)
- Dynamic qubit allocation
This might make it a little easier to handle the fact that it's hard to put some of the algorithms into a single category.
The table could also include a column with a short description and a column for references to each algorithm as well as, potentially, a checkbox that could be ticked once the algorithm is implemented.
This could create a more structured (pun intended) way of organizing the different benchmarks than the current section-based layout.
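To make the suggestion concrete, a small script along these lines could generate such a feature matrix (the benchmark names and feature assignments below are placeholders for illustration, not a proposed final list):

```python
FEATURES = [
    "static loops",
    "dynamic loops",
    "dynamic indexing",
    "measurement conditionals",
    "qubit reuse",
]

# Illustrative entries only; the real table would be curated in the RFC.
BENCHMARKS = {
    "VQE ansatz": {"static loops"},
    "Repeat-until-success": {"dynamic loops", "measurement conditionals", "qubit reuse"},
}

def feature_table(benchmarks, features):
    """Render the benchmark/feature matrix as a Markdown table."""
    header = "| Benchmark | " + " | ".join(features) + " |"
    sep = "|---" * (len(features) + 1) + "|"
    rows = [
        "| " + name + " | "
        + " | ".join("✅" if f in used else "" for f in features) + " |"
        for name, used in benchmarks.items()
    ]
    return "\n".join([header, sep] + rows)

table = feature_table(BENCHMARKS, FEATURES)
```

Generating the table from a data structure would also make the "implemented" checkbox column trivial to add later.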
josh146
left a comment
Nice work @DRovara 💪
> *Quantum Error Correction (QEC) is one of the most important applications of structured control flow in quantum computing. These benchmark programs implement QEC protocols that involve structured operations at any point in the program.*
>
> - *Magic State Distillation*: Magic state distillation protocols utilize loops and conditionals to iteratively improve the fidelity of magic states, which are essential for fault-tolerant quantum computing. ([Bravyi & Kitaev, 2005](https://arxiv.org/abs/quant-ph/0403025))
Do we want to also consider magic state synthesis? Cultivation?
Thanks for compiling this @DRovara! (I don't know if I have approval privileges, but it looks great; my comments are mostly just discussion points.)
mark-koch
left a comment
Nice work @DRovara ! Just leaving a few minor suggestions and possible discussion points
> - Maintenance Overhead: The addition of a new benchmark suite requires ongoing maintenance to ensure that the benchmarks remain relevant and up-to-date with the latest advancements in quantum computing and compiler technologies.
> - Complexity: Introducing structured benchmarks may increase the complexity of the Jeff repository, potentially making it more challenging for new users to navigate and understand the available resources.
> - Limited Adoption: If the benchmarks are not widely adopted by the quantum computing community, their impact may be limited, reducing the incentive for compiler developers to implement support for structured control flow.
> - Comparisons are not simple: For users, it might not be straightforward to know what to compare against when compiling these structured programs. For a full, fair evaluation, a meaningful baseline needs to be established and a more precise methodology for comparison is likely necessary.
I think this is quite important. It would be good if the benchmark challenge came with a way of measuring the performance of programs (e.g., T count, two-qubit gate count, etc.). In the presence of control flow, this would probably require actually running the Jeff program:
- For programs without branching on mid-circuit measurement outcomes, we could just collect the applied gates in an execution trace; no actual quantum simulation is needed, since everything is deterministic. (We still have to figure out how to actually execute Jeff, though.)
- For mid-circuit measurements, we could either:
- Run an actual quantum simulator and sample outcomes, averaging performance metrics over many shots. This doesn't scale to large programs.
- Use weighted random coin flips to classically pick outcomes. This might be unfair for optimisations that rely on realistic measurement distributions.
- Provide the measurement outcomes that should be tested against as part of the benchmark suite. This breaks once people develop optimisations that introduce/delete/reorder measurements.
But maybe this is complicated enough to be discussed in a different RFC?
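As a rough illustration of the trace-based idea with weighted coin flips (this uses a toy instruction list, not the actual Jeff API, and all names are made up), one could average gate counts over classically sampled shots like this:

```python
import random
from collections import Counter

def run_trace(program, shots=1000, seed=0):
    """Interpret a toy gate list, classically sampling measurement outcomes.

    `program` is a list of ops: ("gate", name), ("measure", p1) where p1 is
    an assumed probability of outcome 1, or ("cond_gate", name), which is
    applied only if the most recent measurement returned 1.
    """
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(shots):
        last = 0
        for op in program:
            if op[0] == "gate":
                totals[op[1]] += 1
            elif op[0] == "measure":
                last = 1 if rng.random() < op[1] else 0
            elif op[0] == "cond_gate" and last == 1:
                totals[op[1]] += 1
    # Average gate counts over all shots
    return {g: n / shots for g, n in totals.items()}

counts = run_trace([("gate", "h"), ("measure", 0.5), ("cond_gate", "x")])
```

The deterministic case falls out as the special case with no `measure` ops, where a single shot suffices.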
Yeah, I think quantitative metrics + evaluation deserves a separate discussion.
> - *VQE Ansatz with Fixed Repetitions*: A variational ansatz circuit that applies a set of parameterized gates in a loop with a predetermined number of repetitions. ([Peruzzo et al., 2014](https://arxiv.org/abs/1304.3061))
> - *QAOA with Fixed Repetitions*: A Quantum Approximate Optimization Algorithm circuit that applies problem and mixer Hamiltonians in a loop with a fixed number of layers. ([Farhi et al., 2014](https://arxiv.org/abs/1411.4028))
>
> ## Static Loops with Dynamic Qubit Indexing
A different approach to generating these kinds of programs is taking flat QASM2 benchmarking circuits and trying to recover their loop structure. In particular, we could have a look at this paper, where they try to find polyhedral iteration domains to delinearise flat programs.
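As a toy sketch of that direction (this is not the paper's polyhedral method; the representation here is made up), one can detect simple affine index patterns in a flat gate list and roll them back into loop descriptors:

```python
def roll_loop(ops):
    """Roll a flat list of (gate, qubit) ops into (gate, start, step, count)
    descriptors when consecutive ops form an affine index sequence."""
    loops = []
    i = 0
    while i < len(ops):
        gate, q0 = ops[i]
        # Find the longest run of the same gate with a constant index stride
        j = i + 1
        step = None
        while j < len(ops) and ops[j][0] == gate:
            s = ops[j][1] - ops[j - 1][1]
            if step is None:
                step = s
            elif s != step:
                break
            j += 1
        count = j - i
        if count >= 2:
            loops.append((gate, q0, step, count))
        else:
            loops.append((gate, q0, 0, 1))
        i = j
    return loops

# Four Hadamards with stride 1 roll into one loop; the lone CNOT stays flat.
flat = [("h", 0), ("h", 1), ("h", 2), ("h", 3), ("cx", 5)]
rolled = roll_loop(flat)
```

Real circuits would of course need multi-qubit gates and nested/strided domains, which is where the polyhedral machinery comes in.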
bachase
left a comment
Great RFC and also great feedback from others.
One high-level consideration, which might prompt a change to the RFC format, would be to add example use cases or user flows in the guide-level section. For example, what is the process for contributing a benchmark? How would I use an existing benchmark program to assess my compiler's performance? How would I participate in the "challenge"?
I'm not blocking on that suggestion, as getting these initial benchmarks seems mostly clear. But it might help disentangle questions around how these benchmarks will be used and what is out of scope for this contribution.
> There are several potential drawbacks to consider with this proposal:
>
> - Maintenance Overhead: The addition of a new benchmark suite requires ongoing maintenance to ensure that the benchmarks remain relevant and up-to-date with the latest advancements in quantum computing and compiler technologies.
We can also spin up a separate jeff-bench repo or similar, at least to avoid any challenges managing dependencies for the core jeff code from the benchmark generation code.
I think we should be good here. We can use inline script metadata for the Python scripts that generate benchmarks. uv has great support for these.
See https://docs.astral.sh/uv/guides/scripts/#declaring-script-dependencies for some good documentation on that.
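For reference, PEP 723 inline script metadata looks like the following (the script body itself is just a hypothetical stand-in for a benchmark generator; the real generators would declare their actual dependencies):

```python
# /// script
# requires-python = ">=3.11"
# dependencies = []
# ///
"""Hypothetical benchmark generator carrying its own dependency metadata."""
import math

# e.g., rotation angles for a parameterized ansatz benchmark
angles = [k * math.pi / 4 for k in range(4)]
```

Running such a script with `uv run` resolves the declared dependencies on the fly, so the core jeff package never has to depend on the benchmark-generation code.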
dime10
left a comment
Thanks Damian, this is a great first RFC!
> | conditionals on originally classical values | Conditional blocks are used where the condition depends on values that were *not* measurement results. |
> | conditionals on measurement results | Conditional blocks are used where the condition depends on values that depend on measurement results. |
I'm curious, for the conditional we separate out classical dynamic values from "quantum dynamic values" (i.e. measurement results), but we didn't do this for the loops (I think both are contained in the dynamically-bounded loops).
Yes, that could also be done, but I did not want to add too many different categories.
At the end of the day, (I believe) the main reason dynamically bounded loops are so difficult for compilers is that they cannot be "unrolled" at compile time. In that case, whether these values were measurement results or not does not make much of a difference.
As conditionals are a bit simpler than loops, on the other hand, adding this distinction there might be more helpful.
But that's not a strong opinion on my end; if we think this new category should be added, I am also fine with it.
Yeah I'm not necessarily advocating for splitting the loop category as well, but I'm also not sure why the distinction exists with conditionals. In my mind, a value is either available in advance (static) or computed at runtime (dynamic), whether from a measurement or not.
I guess the idea might be to distinguish dynamic values that are impossible to convert to static ones from "pseudo" dynamic ones. For example, program arguments (and any value deterministically depending on them) can just be supplied and compilation specialized to this instantiation of arguments, so maybe they are not "true" dynamic values, whereas measurement results are impossible to know in advance. But you could ascribe this property to classical non-deterministic values as well.
At the end of the day, though, if a compiler doesn't receive those concrete instantiations, it will face the same challenge in compilation whether it's "true" dynamism or not 🤔
This is related to some of my comments. From an algorithmic perspective I see an important distinction between the two. Is that distinction less important at the IR or compiler level? I was thinking that the results coming from the quantum device might be relevant for benchmarking metrics (or even have implications for program optimization?).
> This is related to some of my comments. From an algorithmic perspective I see an important distinction between the two.
Could you elaborate on the importance of this distinction from an algorithmic perspective?
> Is that distinction less important at the IR or compiler level?
Well I think from a practical "intermediate" compiler perspective the only thing that matters is whether you have access to these values or not. For instance, if you know the value of a conditional predicate, there is no reason not to flatten it. For loop bounds, you would have the choice to flatten, or not, or be able to make some other deductions (e.g. resource counts) if you know those values. But if you don't know the value of the conditional predicate, you have to preserve it considering both branches. Does it matter here whether the predicate is of quantum or classical origin? I'm not so sure.
Having said that, if you are compiling for a concrete architecture there may be a big difference. A quantum-origin conditional has to be compiled to some instructions on the control system, whereas a classical-origin one might be compiled to a side processor.
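As a toy sketch of the "flatten when the predicate is known" point (the mini-IR below is made up for illustration, not Jeff's actual representation):

```python
def flatten_known_conditionals(ops, known):
    """Replace ("cond", var, then_ops, else_ops) nodes with the taken branch
    whenever the predicate variable's value is known at compile time."""
    out = []
    for op in ops:
        if op[0] == "cond" and op[1] in known:
            out.extend(op[2] if known[op[1]] else op[3])
        else:
            out.append(op)  # unknown predicate: both branches must be kept
    return out

prog = [("gate", "h", 0), ("cond", "c", [("gate", "x", 0)], [])]
flat = flatten_known_conditionals(prog, {"c": True})
```

Nothing in this pass cares whether `c` came from a measurement or a classical input; only membership in `known` matters, which is the point being made above.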
> Could you elaborate on the importance of this distinction from an algorithmic perspective?
It's admittedly hand-wavy. The value is (1) the result of a fundamentally different kind of computational process that runs on a separate device, and (2) there is nondeterminism beyond regular classical randomness, in that there could be errors due to noise, additional readout mitigation required, etc. that affect both the value of the variable and the subsequent branch it takes.
In any case, my argument is getting a bit more philosophical, and I don't want this to block the PR 😅
burgholzer
left a comment
Great work @DRovara 👏🏼
Just spotted a typo and a missing dot. Otherwise this is spot on 🎯
Co-authored-by: Lukas Burgholzer <[email protected]>
Co-authored-by: Lukas Burgholzer <[email protected]>
This PR proposes RFC 0032, which advocates for the addition of a benchmark suite for structured quantum programs.
Problem
There is no standard benchmark suite for evaluating compiler support for structured control flow (e.g., `if`, `for`, dynamic indexing).
Solution
Create a set of benchmarks written in Jeff to fill this gap. This will help drive compiler development, allow for better tool evaluation, and highlight Jeff's capabilities in representing these advanced programs.
This RFC outlines the initial set of benchmarks and invites community feedback.
Rendered RFC