
Decision log plugin: don't drop events when reaching buffer limit, slow down and retry #7454

Open
sspaink opened this issue Mar 18, 2025 · 1 comment


sspaink commented Mar 18, 2025

What is the underlying problem you're trying to solve?

The decision log plugin manages a buffer. By default this buffer has an "unlimited" size, so valid logged events will only be dropped if OPA crashes due to running out of memory (worst-case scenario). To prevent an OOM crash, the user can configure a buffer size limit with buffer_size_limit_bytes. Then, if the buffer fills up, the oldest events are dropped to make room for new incoming events (it works as a circular buffer). The problem with dropping events when the limit is reached is that it sacrifices auditability for latency. This is an issue in scenarios where every event is critical and should never be dropped, where slowing down is preferable to losing data.

In an upcoming PR a new buffer type will be introduced that doesn't support an "unlimited" size and also defaults to dropping events when the limit is reached. Both the new and the current buffer would benefit from a way to keep all incoming events.

Describe the ideal solution

A new configuration option to change the behavior so that an incoming event is never dropped; instead, the plugin keeps retrying until there is room in the buffer. Possibly also a configuration option controlling how long to retry before giving up.

Ideally there would also be a way to communicate back to the log producer with an error code (for example, HTTP 429 Too Many Requests) so that the producer can either slow down or start sending events to a different OPA instance.

Additional Context

#5724


stale bot commented Apr 17, 2025

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label Apr 17, 2025