-
Notifications
You must be signed in to change notification settings - Fork 373
fix: Push Health query optimizations #9019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
6e0bcca to
4ed8891
Compare
4ed8891 to
0617215
Compare
c7d8089 to
822d896
Compare
| push=push, | ||
| job_type__name__in=failed_job_types, | ||
| result__in=["success", "unknown"], | ||
| ).select_related("job_type", "machine_platform", "taskcluster_metadata") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make this all_jobs- we could have failing jobs for a different reason (i.e. infra, or other test) and it would be good to know that this test failed 2/6 times, even if the job failed 5/6 times. log parser does this automatically by detecting intermittents (about 70% of them) and marks them as failure_classification_id=8.
I would recommend a single query to get all the jobs and related fields, this would allow you to query within the data for failure classification and sort by job_type.name. Most pushes (especially on try) will have <500 jobs, so this seems preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be resolved in the next patch to further implement changes.
| # Get commit history for Hg repos | ||
| commit_history_details = None | ||
| if repository.dvcs_type == "hg": | ||
| commit_history_details = get_commit_history(repository, revision, push) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need commit history? that is the old method IIRC>
| ) and push_result != "fail": | ||
| push_result = metric_result | ||
| elif metric_result == "fail": | ||
| push_result = metric_result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the concept of indeterminate or unknown is resolved by failure_classification_id, so we can simplify this code, basically the map is:
failure_classification_id=1 (pass/fail/unknown)
failure_classification_id=4 (classified - currently only by humans)
failure_classification_id=6 (new failure- finds all regressions on autoland/central; uses 21 day cache)
failure_classification_id=8 (intermittent needs bugid- log parsing determined that future data existed such that this didn't reproduce and needs to be classified with a bug. if more future data comes in and it is almost perma fail, this gets reset to id=1||6.
knowing that, we have determinations made and don't need logic to summarize this. Basically we have:
- new failure - highest priority to look at
- intermittent needs bug - ignore for now, small edge cases could result in real issues
- not classified (id=1) - everything else- ideally not much is left here that is a failure.
| "repo": repository.name, | ||
| "needInvestigation": len(push_health_test_failures["needInvestigation"]), | ||
| "author": push.author, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we are keeping newrelic around, I think sentry we are getting rid of, not sure about newrelic. @Archaeopteryx do you know about newrelic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ops have suggested a planned to migrate to OpenTelemetry. I have not heard of changes to sentry use.
| devServer: { | ||
| host: '0.0.0.0', | ||
| port: 5000, | ||
| port: 5001, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this keeps sneaking in
822d896 to
50e5b7b
Compare
50e5b7b to
68f3e89
Compare
|
@camd should we condition to work on this, or assume what has landed is the path forward and assume any future changes will occur naturally unrelated to this specific PR? |
This PR simplifies the query Push health uses to find failures.