-
Notifications
You must be signed in to change notification settings - Fork 0
Description
As of November 6 this query is our 2nd most expensive query over the last 7d in terms of load over time.
It's not called too frequently (30k calls in a week) or returning a large data set (avg 92 rows), but it's avg. latency is > 160ms and the p99 is more like 740ms. The main source of the cost here is sequential scans of provisioner_jobs and workspace_builds in order to find the first successful build per workspace.
Currently for workspace builds we have an explicit index on only the initiator ID and the actual build ID (PK), then implicit indices as a result of UNIQUE constraints for (workspace_id, name), (workspace_id, build_number), and (job_id). For provisioner_jobs we have explicit indices on started_at, job_status, and worker_id, organization_id, completed_at).
We do not have indices on either workspace_builds.transition nor provisioner_jobs.job_status and thus have to do sequential scans of both tables in order to fetch the relevant data. For ourselves internally this is scans of ~80k rows (33k returned) and ~92k rows (70k returned) respectively. The join and sort after this are also relatively expensive.
The join (from the plan) on pj.id = wb.job_id also results in a relatively high # of matches (~30k, so roughly 90% of workspace_build start builds succeeded).
I would propose adding indices similar to these:
- btree index for
workspace_buildson(workspace_id, build_number, id). This index will already be ordered so should be able to eliminate the need for the sort after the join. AI is also telling me that if we include the other fields from the query in the index, those beingtemplate_version_preset_id, initiator_id, and job_id, we can speed the query up/reduce the cost further by eliminating the need for heap fetches from theworkspace_buildstable itself, but I'm not sure of this optimization nor whether we want the extra index size that would come from this. - an index for
provisioner_jobseither on(job_status, id)OR on(id) WHERE job_status = 'succeeded'. The latter would help only this query, a more general index on(job_status, id)would help potentially multiple queries such asGetWorkspacesEligibleForTransition,GetFailedWorkspaceBuildsByTemplateID,CountPendingNonActivePrebuilds, and more.