Description
We are seeing system test failures for the websocket and proofpoint_on_demand integrations on stack versions 8.19 and above. Both integrations use the streaming input underneath.
Initial analysis:
In 8.19 and above, we updated the metrics in this PR (line 314) to include a metrics state update from HEALTHY -> DEGRADED when a connection is closed. Under real-world circumstances this is the correct approach and the state change is justified, but it causes issues in our system tests, where the mock server container is often shut down after a successful test suite and spun up fresh for the next data stream's tests. This could be a potential culprit.
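To make the suspected interaction concrete, here is a minimal sketch of how tearing down the mock server after each passing data stream test would produce a DEGRADED update every time. Everything in it (`Status`, `tracker`, `runDataStreams`) is hypothetical and written only for illustration; it is not the streaming input's actual code or the elastic-agent status API.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the input's reported health state.
type Status int

const (
	Healthy Status = iota
	Degraded
)

func (s Status) String() string {
	if s == Healthy {
		return "HEALTHY"
	}
	return "DEGRADED"
}

// tracker records every status update it receives, much like a log would.
type tracker struct {
	current Status
	updates []string
}

func (t *tracker) update(s Status, reason string) {
	t.current = s
	t.updates = append(t.updates, fmt.Sprintf("%s: %s", s, reason))
}

// runDataStreams simulates n data stream tests. Each test connects to a
// fresh mock server, which is shut down once the test has passed.
func runDataStreams(n int) *tracker {
	t := &tracker{}
	for i := 1; i <= n; i++ {
		t.update(Healthy, fmt.Sprintf("connected to mock server %d", i))
		// The test passed and the container is torn down, so the
		// connection closes and the state is marked as degraded.
		t.update(Degraded, fmt.Sprintf("connection to mock server %d closed", i))
	}
	return t
}

func main() {
	t := runDataStreams(3)
	for _, u := range t.updates {
		fmt.Println(u)
	}
	fmt.Println("final status:", t.current)
}
```

In this simplified model every successful test still ends in DEGRADED, because the teardown is indistinguishable from an unexpected connection closure. Whether that is what actually fails the system tests remains to be confirmed.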
There also seem to be multiple connection state degradations when the initial connection is established, leading to error log bloat in elastic-agent.
Build failures -
We do see a high volume of degraded logs, which is unusual, so there may be other underlying issues in our mock server or the streaming input that have gone unnoticed.