blakerouse commented Oct 28, 2025

What is the problem this PR solves?

Adds extra checks for the local_metadata field when an Elastic Agent checks in. This ensures that an empty string does not cause an attempt to update the document with an invalid object.

Adds further checks to ensure that the bulk update never uses an empty []byte to update metadata or components.
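
The two checks described above can be illustrated with a minimal sketch. It is not the fleet-server code: `usableJSON` and `buildUpdate` are hypothetical names, and a plain map stands in for whatever structure the bulk update actually uses. The idea is only that a nil, empty, or invalid payload is never written into the update.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// usableJSON is a hypothetical helper (not from the PR). It reports whether
// raw is safe to write into a document field: nil, empty, whitespace-only,
// and syntactically invalid payloads are all rejected.
func usableJSON(raw []byte) bool {
	if len(bytes.TrimSpace(raw)) == 0 {
		return false
	}
	return json.Valid(raw)
}

// buildUpdate is also hypothetical. It adds a field to the update only when
// its payload passes the guard, so an empty []byte never reaches the document.
func buildUpdate(localMeta, components []byte) map[string]json.RawMessage {
	fields := make(map[string]json.RawMessage)
	if usableJSON(localMeta) {
		fields["local_metadata"] = json.RawMessage(localMeta)
	}
	if usableJSON(components) {
		fields["components"] = json.RawMessage(components)
	}
	return fields
}

func main() {
	// Empty local metadata is skipped; the components payload is kept.
	fields := buildUpdate([]byte(""), []byte(`[{"id":"c1"}]`))
	out, err := json.Marshal(fields)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Skipping the field, rather than substituting `{}` or `null`, leaves the previously stored value untouched, which appears to be the intent of the description above.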

Fixes an issue in the checkin bulker that could cause a check-in's previous values to be lost when the check-in ticker has a very low interval (not a realistic setting in the field).

How does this PR solve the problem?

It adds more defensive checks, plus unit tests to ensure that it stays that way.

How to test this PR locally

Unit tests exercise the bad code paths, but it is still unclear how to reproduce the problem in the field.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

blakerouse self-assigned this Oct 28, 2025
blakerouse requested a review from a team as a code owner October 28, 2025 21:42
blakerouse added the Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team) and backport-active-all (Automated backport with mergify to all the active branches) labels Oct 28, 2025
prodsecmachine commented Oct 28, 2025

Snyk checks have passed. No issues have been found so far.

| Status | Scanner | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |


var agentLocalMeta interface{}
if err := json.Unmarshal(agent.LocalMetadata, &agentLocalMeta); err != nil {
return nil, fmt.Errorf("parseMeta local: %w", err)
if agent.LocalMetadata != nil {
Contributor

Just curious, why is this nil check necessary? json.Unmarshal(nil, &agentLocalMeta) should result in agentLocalMeta == nil, which is the zero value of agentLocalMeta anyway, right?

Contributor Author

Because this would actually fail if local_metadata were missing. It seems we don't hit this in the real world: Elasticsearch seems to always return it as local_metadata: {}. The unit test now covers both cases, so I added the check here just to be more defensive.

blakerouse and others added 3 commits October 29, 2025 15:50
This is because the checkin bulker has a default timeout of 10 seconds, meaning the
original 10 seconds could result in it being missed by the check.
Co-authored-by: Shaunak Kashyap
Co-authored-by: Shaunak Kashyap
Contributor Author

blakerouse left a comment

@ycombinator Thanks for the suggestions and review. I have applied them.

Development

Successfully merging this pull request may close these issues.

Updating local metadata on checkin fails and then updating the local metadata can no longer happen
