[Buffering] Suboptimal Throughput with Memory Store (FPGA20, FPL22) #529

@shundroid


The buffering algorithm reports that the program below can achieve a throughput of 1.0:

# ================= #
# CFDFC Throughputs #
# ================= #

Throughput of CFDFC #0: 1.000000e+00

However, the observed throughput is only 1/6, far below the reported value.
The discrepancy occurs with both the FPGA20 and FPL22 algorithms.
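
To put the gap in perspective, here is a rough sketch that converts a throughput into an approximate cycle count for the loop. It assumes throughput means loop iterations completed per cycle in steady state and ignores pipeline fill and drain; `approx_loop_cycles` is a hypothetical helper for illustration, not part of Dynamatic.

```c
#include <assert.h>

/* Hypothetical helper (not a Dynamatic API): approximate steady-state
 * cycle count for `iters` loop iterations at a given throughput,
 * where throughput is assumed to be iterations per cycle.
 * Pipeline fill/drain latency is ignored. */
static double approx_loop_cycles(int iters, double throughput) {
  assert(throughput > 0.0); /* a zero throughput would never finish */
  return (double)iters / throughput;
}
```

Under these assumptions, with N = 100, `approx_loop_cycles(100, 1.0)` gives 100 cycles for the reported throughput, while `approx_loop_cycles(100, 1.0 / 6.0)` gives roughly 600 cycles for the observed one, so the loop takes about six times longer than the report suggests.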

When I manually insert buffers, the design achieves the expected throughput of 1.0.

It seems that buffers before memory stores are being skipped. The MILP results also suggest not placing buffers at these locations.

I’m currently investigating the issue.

Program:

#include "test_loop.h"
#include "dynamatic/Integration.h"
#include <stdlib.h>

#define N 100
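/* Kernel under test: each iteration loads a[i], multiplies it by i,
 * and stores the result to b[i]. */
void test_loop(int a[N], int b[N]) {
  for (int i = 0; i < N; i++) {
    b[i] = a[i] * i;
  }
}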

int main(void) {
  int a[N];
  int b[N];

  srand(13);
  for (int j = 0; j < N; j++) {
    a[j] = rand() % 100;
  }

  CALL_KERNEL(test_loop, a, b);
  return 0;
}
