kev1N916

What type of PR is this?

/kind bug

What this PR does / why we need it:

Currently, after a Job is submitted and completes, the job.status.succeeded and job.status.failed fields are not populated.

While the Job is in the Running phase, the status of all its pods is visible.

However, as shown in #4616, once the Job reaches the Completed phase the status of its pods is no longer visible.

With this change, the pod statuses remain visible in the Completed phase as well.

Which issue(s) this PR fixes:

Fixes #4616

@volcano-sh-bot volcano-sh-bot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 28, 2025

Summary of Changes

Hello @kev1N916, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where the job.Status.Succeeded and job.Status.Failed fields were not being updated for completed jobs, preventing users from seeing the final status of individual pods. By introducing a mechanism to track and process 'retained' pods (those not marked for termination), the change ensures that the job's overall status accurately reflects the outcome of all its constituent pods, thereby improving the observability of completed jobs.

Highlights

  • Bug Fix: Job Status Population: Addresses a bug where job.Status.Succeeded and job.Status.Failed fields were not being populated correctly after a job completed, leading to a lack of visibility into individual pod statuses in the completed phase.
  • Retained Pod Processing: Introduces a new map, podsToBeRetained, to explicitly track pods that are not being terminated. These retained pods are now properly processed to ensure their status contributes to the overall job status.
  • Enhanced Status Visibility: Modifies the killPods function to iterate over podsToBeRetained and apply status classification and calculation, ensuring that the job's status accurately reflects the state of all pods, even after completion.
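
A minimal sketch of the idea behind the highlights above, not the actual Volcano code: pods that are kept rather than terminated are tallied by phase so the succeeded and failed counters are filled in. The Pod, PodPhase, and JobStatus types and the countRetained function are hypothetical stand-ins; only the podsToBeRetained name comes from this PR.

```go
package main

import "fmt"

// PodPhase is a hypothetical stand-in for the pod phases relevant here.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// Pod is a simplified, hypothetical pod representation.
type Pod struct {
	Name  string
	Phase PodPhase
}

// JobStatus holds only the counters discussed in this PR (assumed shape).
type JobStatus struct {
	Pending, Running, Succeeded, Failed int32
}

// countRetained adds every retained pod to the counter matching its phase.
func countRetained(podsToBeRetained map[string]Pod) JobStatus {
	var status JobStatus
	for _, pod := range podsToBeRetained {
		switch pod.Phase {
		case PodPending:
			status.Pending++
		case PodRunning:
			status.Running++
		case PodSucceeded:
			status.Succeeded++
		case PodFailed:
			status.Failed++
		}
	}
	return status
}

func main() {
	// Three pods that finished before the job was cleaned up.
	retained := map[string]Pod{
		"job-task-0": {Name: "job-task-0", Phase: PodSucceeded},
		"job-task-1": {Name: "job-task-1", Phase: PodSucceeded},
		"job-task-2": {Name: "job-task-2", Phase: PodFailed},
	}
	fmt.Printf("%+v\n", countRetained(retained))
}
```

Without a pass like this over the retained pods, a completed job would report zero succeeded and zero failed pods even though the pods themselves still exist.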

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hzxuzhonghu
You can assign the PR to them by writing /assign @hzxuzhonghu in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 28, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly fixes a bug where the status of completed jobs was not being populated with the number of succeeded and failed pods. The change introduces logic to account for retained pods when a job is killed.

My review includes two main points:

  1. A high-severity issue where targeted pod/task kills will lead to incorrect job status because retained pods are not being accounted for.
  2. A medium-severity suggestion to refactor the pod processing loops to reduce code duplication and improve maintainability.

Overall, the core change is correct, but the related logic for targeted kills needs to be addressed to prevent incorrect behavior.
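
A rough sketch of both review points, under stated assumptions and not taken from the Volcano source: a single helper (addByPhase, a hypothetical name) maps a pod's phase to a counter, and a targeted kill (killTask, also hypothetical) still counts the pods it leaves alone. Treating pods selected for deletion as terminating is an assumption made for illustration.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical pod representation.
type Pod struct {
	Name  string
	Task  string
	Phase string // "Pending", "Running", "Succeeded", or "Failed"
}

// JobStatus holds the counters used in this sketch (assumed shape).
type JobStatus struct {
	Pending, Running, Succeeded, Failed, Terminating int32
}

// addByPhase is the one place where a phase is mapped to a counter,
// so every loop that classifies pods can share it.
func addByPhase(status *JobStatus, pod Pod) {
	switch pod.Phase {
	case "Pending":
		status.Pending++
	case "Running":
		status.Running++
	case "Succeeded":
		status.Succeeded++
	case "Failed":
		status.Failed++
	}
}

// killTask selects pods of targetTask for deletion and counts all
// other pods by phase instead of dropping them from the status.
func killTask(pods []Pod, targetTask string) JobStatus {
	var status JobStatus
	for _, pod := range pods {
		if pod.Task == targetTask {
			// Assumption: a pod selected for deletion is reported as terminating.
			status.Terminating++
			continue
		}
		addByPhase(&status, pod)
	}
	return status
}

func main() {
	pods := []Pod{
		{Name: "job-worker-0", Task: "worker", Phase: "Running"},
		{Name: "job-worker-1", Task: "worker", Phase: "Running"},
		{Name: "job-master-0", Task: "master", Phase: "Succeeded"},
	}
	fmt.Printf("%+v\n", killTask(pods, "worker"))
}
```

Keeping the phase switch in one helper means a new phase or counter only has to be handled once, whichever loop encounters the pod.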

kev1N916 force-pushed the fix-job_controller branch 3 times, most recently from dfadff8 to 0b2632c, on September 29, 2025 13:02
@kev1N916
Author

kev1N916 commented Oct 2, 2025

@hwdef Could you review this when you find the time?

kev1N916 force-pushed the fix-job_controller branch 8 times, most recently from 27ae2ad to 99395fb, on October 2, 2025 13:12
kev1N916 force-pushed the fix-job_controller branch from 99395fb to 8c454c3 on October 2, 2025 13:57
Labels
kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

job.status.succeeded Is Not Correctly Populated.
2 participants