-
Notifications
You must be signed in to change notification settings - Fork 49
feat: Add credential test with DNS retry logic #588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
@@ -354,3 +354,29 @@ def run_kubectl_apply(yml_string: str, task: str, args: Namespace) -> int: | |||
command = f'kubectl apply -f {str(tmp.file.name)}' | |||
err_code = run_command_with_updates(command, task, args) | |||
return err_code | |||
|
|||
def run_command_and_capture_output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason I added this function is that I need the output string along with a 0 or 1 return code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return_code
is usually the second parameter across entire codebase. tuple[<result>, <error_code>]
. Could you swap the order of values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -64,6 +65,7 @@ def set_jobset_on_cluster(args) -> int: | |||
command = ( | |||
'kubectl apply --server-side -f' | |||
f' https://github.com/kubernetes-sigs/jobset/releases/download/{JOBSET_VERSION}/manifests.yaml' | |||
' --force-conflicts' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might require @pawloch00 to confirm whether this flag is needed. The details are described in this ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 @pawloch00
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scaliby do you happen to have an opinion about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SujeethJinesh I read the ticket. IMO this is fine.
@SujeethJinesh Please review this PR. My current default for get-credentials is --dns-endpoint, because I see that the create cluster command also defaults to --enable-dns-access. |
src/xpk/core/cluster.py
Outdated
) | ||
xpk_print(kubectl_output) | ||
|
||
if kubectl_return_code != 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if dns_endpoint_error in kubectl_output:
can be inversed to reduce nesting.
if dns_endpoint_error not in kubectl_output:
xpk_print(f'kubectl failed with an unhandled error: {kubectl_output}')
xpk_exit(kubectl_return_code)
# The rest of the logic
Also if kubectl_return_code != 0:
could be inversed further reducing nesting:
if kubectl_return_code == 0:
xpk_print("Credentials test suceeded")
return 0
# The rest of the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -354,3 +354,29 @@ def run_kubectl_apply(yml_string: str, task: str, args: Namespace) -> int: | |||
command = f'kubectl apply -f {str(tmp.file.name)}' | |||
err_code = run_command_with_updates(command, task, args) | |||
return err_code | |||
|
|||
def run_command_and_capture_output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return_code
is usually the second parameter across entire codebase. tuple[<result>, <error_code>]
. Could you swap the order of values?
@scaliby I modified the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, just needed confirmation about --force-conflicts
@@ -64,6 +65,7 @@ def set_jobset_on_cluster(args) -> int: | |||
command = ( | |||
'kubectl apply --server-side -f' | |||
f' https://github.com/kubernetes-sigs/jobset/releases/download/{JOBSET_VERSION}/manifests.yaml' | |||
' --force-conflicts' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SujeethJinesh I read the ticket. IMO this is fine.
…/AI-Hypercomputer/xpk into lidanny/feature/check_dns_access
@DannyLiCom any updates on this? |
Fixes / Features
Testing / Documentation
To verify this feature, run the standard
xpk cluster create-pathways
command. The new credential test with DNS retry logic will be executed as part of the cluster creation process.https://screenshot.googleplex.com/cTjV7ABQr2omQZd
If a DNS endpoint-related error occurs, the following message will be displayed, followed by an automatic retry:
https://screenshot.googleplex.com/BunMupVA8Yp998o
However, because the --enable-dns-access flag is present by default, the error mentioned above will not appear.