Skip to content

Conversation

pcineverdies
Copy link
Collaborator

@pcineverdies pcineverdies commented Aug 22, 2025

This PR introduces the code for "Straight to the Queue", as described in the original paper "FPGA-23".
The aim of the pass is to simplify the connection between the circuit and the LSQ, speeding-up the block activation within the LSQ by relying on data-dependencies only.
The pass can be activated with the flag --straight-to-queue during the compile stage.
This has no impact on the main flow, which is kept identical.

The following tests are failing when combining Fast Token Delivery with this pass:

  • gemver, insertion_sort, gemver_float (which were not working since the initial implementation in my thesis);
  • lu (which was weirdly working some months ago, so it might be related to some changes and this pass being not aligned with them);
  • polyn_mult (which works when the network of cmerges is not removed from the circuit, so it's likely a timing issue related to the termination of the circuit);
  • kmp and iterative_sqrt, which are failing with FTD alone, as mentioned in Discussion around FTD an main Dynamatic flow #571 .

@AyaElAkhras
Copy link
Member

Thanks, Giacomo!

For gemver, insertion_sort and gemver_float, I recall that they all suffered from the same issue you identified during simulation: the LSQ deadlocked when it received multiple requests in a single cycle. Is that it? If so, could you please document this in a new issue?

As for lu, what kind of failure does it get?

For polyn_mult, the issue has to do with missing proper ACK signal from store operations to ensure that FTD returns only when all stores are actually completed.

/// information among them, create a group graph.
static DenseSet<MemoryGroup *>
constructGroupsGraph(SmallVector<handshake::MemPortOpInterface> &lsqOps,
SmallVector<ProdConsMemDep> &lsqMemDeps) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Triviality but lsqMemDeps should be a const since it is not modified inside the function, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely!

}

// Add a self dependency each time you have a group with no dependency
for (MemoryGroup *g : groups) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it has been a while, but by any chance do you recall why this is needed? Why would we consider any group that has no dependency to be dependent on itself? Is it like a default value that will be in any case replaced by startValue when we call createPhiNetworkDeps? If so, is there a need to have it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried going back in history (with some git blame and git log) to understand what was going on, but the amount of refactoring I did one year ago is emberassing. As soon as possible I will launch an integration test as wee what is going on when removing such for loop. Seeing the difference in results might help us coming up with an educate answer.

My feeling is that this will allow the insertion of a phi node at the first node of the graph (one with no predecessors) and receiving the start value this way allows to consider each node in the same way, no matter its position in the graph. This is clearly just a guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants