CI Run Link: https://github.com/coder/coder/actions/runs/19254608199
Failed Job: test-go-pg (macos-latest)
Commit: c21b3e49b36e8aea1b743e5b76e9ac6d3b8e3339 (Atif Ali)
Date: 2025-11-11
Root cause classification: Flaky test (goroutine leak)
Evidence from logs:
Goroutine leak detected by goleak:
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 241684 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*MetricsCollector).BackgroundFetch on top of the stack:
github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*MetricsCollector).BackgroundFetch(0x14017a61d40, {0x10877f388, 0x140443aa320}, 0xdf8475800, 0x2540be400)
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/metricscollector.go:235 +0xac
github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func2()
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:155 +0x70
created by github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run in goroutine 241683
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:153 +0x368
Goroutine 241683 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run on top of the stack:
github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run(0x1403cd8bb00, {0x10877f350, 0x1403d76bda0})
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:188 +0x4f8
runtime/pprof.Do({0x10877e6f8?, 0x10b3c2560?}, {{0x14043dfeda0?, 0x1403d75d880?, 0x14043ff6960?}}, 0x1403d77fc20)
/Users/runner/work/_tool/go/1.24.10/arm64/src/runtime/pprof/runtime.go:51 +0x78
created by github.com/coder/coder/v2/coderd/pproflabel.Go in goroutine 240105
/Users/runner/work/coder/coder/coderd/pproflabel/pproflabel.go:10 +0xac
Goroutine 241685 in state select, with github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func3 on top of the stack:
github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run.func3()
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:173 +0xc4
created by github.com/coder/coder/v2/enterprise/coderd/prebuilds.(*StoreReconciler).Run in goroutine 241683
/Users/runner/work/coder/coder/enterprise/coderd/prebuilds/reconcile.go:171 +0x3d4
]
FAIL github.com/coder/coder/v2/enterprise/coderd 191.159s
Notes:
- No data race warnings present (checked for "WARNING: DATA RACE" and "race detected").
- No panic, OOM, or other process-crash indicators in logs.
- The failure occurred within minutes of the Slack alert, which confirms that the correct run was analyzed.
Best assessment:
- The enterprise/coderd package tests started a prebuilds StoreReconciler, which in turn started MetricsCollector.BackgroundFetch.
- These background goroutines were not shut down at test teardown, causing goleak to fail the package.
- Likely cause: either Stop() is not called on the StoreReconciler during test cleanup, or server teardown does not stop the reconciler/collector.
Assignment analysis:
- Specific failing test function is not identified; goleak triggers after the package tests complete.
- The leaked goroutines originate from enterprise/coderd/prebuilds (StoreReconciler, MetricsCollector). Recent ownership/contributors in this area include Susana Ferreira and Sas Swart.
- Suggest assignment to prebuilds ownership for investigation. If ownership differs, please re-route accordingly.
Proposed next steps:
- Ensure StoreReconciler.Stop() is called in test teardowns where a reconciler is created or where the server starts it.
- Consider registering a t.Cleanup to Stop() any reconciler started during tests.
- Optionally gate metrics registration in tests or use a shorter context/interval to avoid long-lived background goroutines.
Related issues search:
- Searched coder/internal for duplicates with queries: "goleak", "goroutine leak", "MetricsCollector", "StoreReconciler", "BackgroundFetch" – none found open or recently closed.
Reproduction hint:
- Run enterprise/coderd package tests on macOS with -count=1 and observe if goleak triggers intermittently:
go test ./enterprise/coderd -count=1 -run .
Labels: flake