feat(aws): add support for reloadable static creds #5577
What prevents you from using established methods like Vault Agent (https://www.hashicorp.com/en/blog/refresh-secrets-for-kubernetes-applications-with-vault-agent) to restart the service upon credential updates? Furthermore, why are we not leveraging Pod Identity, IRSA, or similar robust identity management solutions?
Good question! Normally, we use amazon-eks-pod-identity-webhook with IRSA to provide temporary credentials to pods on CSP-managed Kubernetes. However, for non-CSP-managed Kubernetes, we're investigating using Vault configured with JWT validation pubkeys to have Vault issue these temporary credentials via the external-secrets operator's VaultDynamicSecret. This avoids having to expose the OIDC discovery information for the cluster, which by definition must be publicly accessible and unauthenticated.

Using Vault Agent would mean running a sidecar, which is yet another component that needs to be monitored, maintained, and updated. Additionally, Vault Agent works by disrupting the running container to get it to reload, which isn't great because external-dns runs as a singleton. It would also be restarted up to 24 times per day, as the temporary credentials we supply have a TTL of one hour.

This PR implements a solution that makes it just work if someone wants to utilize aws/aws-sdk-go#3163 This PR implements the same kind of automatic refreshing of the credentials file that has been implemented in
When supplying credentials with Hashicorp Vault, users create a file with static credentials that's referenced by the environment variable `AWS_SHARED_CREDENTIALS_FILE`. The contents of this file are temporary credentials that are rotated by Vault on a continuous basis to ensure they're always valid. AWS SDK for Go (v2) presently only reads this file once on load, so when the credentials expire it will never see the updates supplied by Vault. To fix this, we create a credentials provider which loads the default config on each retrieval. As a result, the credentials should always be the latest credentials supplied by Vault.
I'm not sure whether or not the unit test is correct. In

	func Test_newV2Config_WithRefresh(t *testing.T) {
		ctx, cancel := context.WithCancel(context.Background())
		defer cancel()
		done := make(chan struct{})
		var cfg awsv2.Config
		var err error

		// Point the SDK at a temporary shared credentials file.
		dir := t.TempDir()
		credsFile := filepath.Join(dir, "credentials")
		t.Setenv("AWS_SHARED_CREDENTIALS_FILE", credsFile)

		// Write the initial credentials; the file only needs to be readable by its owner.
		err = os.WriteFile(credsFile, []byte(`
	[default]
	aws_access_key_id=AKID1234
	aws_secret_access_key=SECRET1
	`), 0o600)
		require.NoError(t, err)

		cfg, err = newV2Config(AWSSessionConfig{})
		require.NoError(t, err)

		go func() {
			defer close(done) // Ensure the channel is closed at the end
			// Execute the function
			cfg, err = newV2Config(AWSSessionConfig{})
			require.NoError(t, err)
		}()

		// Wait for the channel to close or context to be canceled
		select {
		case <-done:
			// Test completed successfully
		case <-ctx.Done():
			t.Fatal("Test timed out or was canceled")
		}

		fmt.Println("creds first call >>>")
		fmt.Println(cfg.Credentials.Retrieve(context.Background()))

		// Simulate a rotation by overwriting the credentials file.
		err = os.WriteFile(credsFile, []byte(`
	[default]
	aws_access_key_id=AKID2567
	aws_secret_access_key=SECRET2
	`), 0o600)
		require.NoError(t, err)

		fmt.Println("creds second call >>>")
		fmt.Println(cfg.Credentials.Retrieve(context.Background()))
	}

The goroutine is most likely not required.
I'm not sure I understand the concern with my test. In my test as committed it only calls The test you wrote will block on Can you restate the concern with my test? |
Tests are fine. My primary concern with this Pull Request is that it appears to read the file system on every API request to AWS, which I don't believe is a recommended approach for credential handling. Typically, credentials should only be read or refreshed when necessary. In my opinion, this behavior should be supported at the AWS SDK level. I would suggest opening a new issue or PR in the What is required;
The feature itself makes sense, as does the use case. Therefore, this feature will be put on hold for now. A relevant issue is aws/aws-sdk-go-v2#2135, along with similar ones, which outlines how credential refresh decisions are currently made.
/hold
I don't think getting it added in aws-sdk-go-v2 is viable given they closed that issue saying to do exactly what I've done here, implementing our own provider. How about we implement a different kind of provider, then? What we really want here is to be able to get credentials from Vault, so what if we implemented a provider that would do that? It could use JWT or Kubernetes auth methods and only fetch new creds from Vault when the creds it already has expire. If that sounds like a better option, I'm happy to implement that. It would mean adding config so the user could say "for AWS, use Vault, and here's its address and the auth method/mount path/role to use".
We are actually moving providers out-of-tree (#4347), so only a webhook is an option at the moment for a bespoke setup: https://kubernetes-sigs.github.io/external-dns/latest/docs/tutorials/webhook-provider/

My concern is that adding this as a custom credentials resolver, without first engaging the aws-sdk-go-v2 team for a library improvement, is not a scalable solution. In a common Kubernetes environment, ExternalDNS is rarely the only add-on installed. Clusters often run components like autoscaler, Karpenter, VPC CNI, EBS CSI, API Gateway, and many more. Implementing a custom credential resolver in every single one of these projects is unsustainable. Therefore, there's a strong case for integrating this functionality directly into the SDK library as one of its standard configuration options.

In the meantime, there is a workaround from the external-secrets team: https://github.com/external-secrets-inc/reloader

I strongly feel this feature is critically needed in the SDK,
I have not tried it, but if the static credentials have expired, the library should in theory refresh them when an API call explicitly returns an ExpiredToken or a similar authentication error. Is this currently not happening?
It seems this has been moved to a community feature request and needs upvotes to be prioritized: aws/aws-cli#9034. Did you take a look at using an external process for credentials, like what exists for the Terraform provider?
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
What does it do ?
Creates a credentials provider which loads the default config on each retrieval when not using AssumeRole.
Fixes #3713
Motivation
When supplying credentials with Hashicorp Vault, users create a file with static credentials that's referenced by the environment variable `AWS_SHARED_CREDENTIALS_FILE`. The contents of this file are temporary credentials that are rotated by Vault on a continuous basis to ensure they're always valid. AWS SDK for Go (v2) presently only reads this file once on load, so when the credentials expire it will never see the updates supplied by Vault.