Contributor

@mwmix mwmix commented Jul 20, 2025

What does it do?

Fixes an issue when you have the AWS_CA_BUNDLE set and are using the 'aws' provider. The error is as follows:

instantiating AWS config: unable to add custom RootCAs HTTPClient, has no WithTransportOptions, *http.Client

Motivation

This caused me a ton of headaches recently, and I was forced to use a Debian-based container of the tool as opposed to the scratch-based one.

More

  • [ x ] Yes, this PR title follows Conventional Commits
  • [ x ] Yes, I added unit tests
  • [ x ] Yes, I updated end user documentation accordingly

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. provider Issues or PRs related to a provider labels Jul 20, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 20, 2025
@k8s-ci-robot
Contributor

Hi @mwmix. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 20, 2025
Member

Configuring metrics per provider is not a great idea. Please avoid that.

Contributor Author

I agree with you, I've been struggling to find a better implementation. Any insights would be appreciated.

Member

		config.WithHTTPClient(extdnshttp.NewInstrumentedClient(AWSHTTPCLIENT)),

^ Something similar

What happens if we simply add WithCustomCABundle? My understanding is that it could work, even if we are wrapping the HTTP client.

Contributor Author

@mwmix mwmix Jul 20, 2025

I tried to come up with a way to do what you suggested before I dove down the middleware rabbit hole, but I kept hitting roadblocks. The closest I could get was this, which produces a type error:

func NewInstrumentedAWSClient(next *awshttp.BuildableClient) *awshttp.BuildableClient {
	return next.WithTransportOptions(func(transport *http.Transport) {
		// Type error reported here: next is a *BuildableClient, not an http.RoundTripper.
		transport = NewInstrumentedTransport(next)
	})
}

The type error is

cannot use next (variable of type *"github.com/aws/aws-sdk-go-v2/aws/transport/http".BuildableClient) as "net/http".RoundTripper value in argument to NewInstrumentedTransport: *"github.com/aws/aws-sdk-go-v2/aws/transport/http".BuildableClient does not implement "net/http".RoundTripper (missing method RoundTrip) 

So WithCustomCABundle would "work", and so does just updating my system's trusted CA list. However, the moment we set AWS_CA_BUNDLE, even with WithCustomCABundle set, we still hit that error. One workaround would be to call WithCustomCABundle with the value pulled from AWS_CA_BUNDLE and then unset the environment variable so the SDK doesn't use it. But I imagine that has its own issues ...

Looking at other projects such as otel, and based on my own reading of the documentation, it seemed like this was the expected interface for adding this sort of functionality.

I'm not really married to any particular solution and appreciate your feedback. I've been trying my best to minimize the impact of the changes while preserving the existing instrumentation but haven't found any cleaner way around it.

Member

I do not have access to AWS. If you could reproduce it with kind+localstack or similar, I might be able to have a look.

Have you tried WithCustomCABundle?

Member

Also, have you tried, for example:

config.WithHTTPClient(extdnshttp.NewInstrumentedClient(&http.Client{Transport: transport.NewBuildableClient().GetTransport()})),

Contributor Author

I tried WithCustomCABundle() and the error still appears when I try to run with the AWS_CA_BUNDLE defined.

I also just finished testing the last example you gave me above, with the same result :(

Member

Gotcha

Member

@ivankatliarchuk ivankatliarchuk Jul 22, 2025

Just a note about OpenTelemetry. It's OK to have middleware support, but it is still expected to support RoundTrippers: https://github.com/open-telemetry/opentelemetry-go-contrib/blob/caee80916a50f168c7152967dabaefd0c3cd17c0/instrumentation/net/http/otelhttp/transport.go#L26

Basically, this is to be consistent across multiple libraries. With AWS there is no such option; it's quite AWS-specific.

Member

Can we keep everything within the existing package sigs.k8s.io/external-dns/pkg/http, unless there’s a circular dependency that prevents it or other valid reason?

Please note that packages named utils are not being approved at this time.
For reference, see this example where a similar utils package was proposed but not accepted: #5189 (comment).

Contributor Author

Sure, ty for the insight. I'll update the MR appropriately.

Member

Contributor Author

Yeah absolutely, I didn't realize you already had some stashed away in the repo! :)

@mwmix mwmix force-pushed the fix-aws-ca-bundle branch 3 times, most recently from 7b1027a to ae5a1bd Compare July 20, 2025 12:50
Member

@ivankatliarchuk ivankatliarchuk left a comment

There should be a simple reason why the certificate bundle is not loaded. It's hard to say without a how-to-reproduce example.

@ivankatliarchuk
Member

Why it is not working is described here: https://github.com/aws/aws-sdk-go-v2/blob/f9f7a6bb124a1a7daffc65db40053d97678bd371/config/env_config.go#L174-L189. In principle, the AWS SDK should take a RoundTripper instead of a Transport, as the Go http library does.

Could we add an option to our instrumenter to wrap and return a Transport, if config.WithHTTPClient(extdnshttp.NewInstrumentedClient(&http.Client{Transport: transport.NewBuildableClient().GetTransport()})) is not going to work?

Middleware is an OK way, but the AWS library middleware is just too complex.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 21, 2025
@ivankatliarchuk
Member

ivankatliarchuk commented Jul 21, 2025

Technically, if we are passing a Transport (cfg.HTTPClient.(*awshttp.BuildableClient)), then resolveCustomCABundle (https://github.com/aws/aws-sdk-go-v2/blob/f9f7a6bb124a1a7daffc65db40053d97678bd371/config/resolve.go#L42) should attach the TLS certs.

Just need to find out how to enhance it with our metrics.

@ivankatliarchuk
Member

I'm unsure, actually. I do get why the AWS team did it that way, but it's not simple to extend.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 21, 2025
@mwmix mwmix force-pushed the fix-aws-ca-bundle branch from ae5a1bd to 58a3257 Compare July 21, 2025 23:21
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 21, 2025
@ivankatliarchuk
Member

/retitle fix(aws): support aws_ca_bundle

@k8s-ci-robot k8s-ci-robot changed the title Fix aws ca bundle fix(aws): support aws_ca_bundle Jul 22, 2025
pkg/http/http.go Outdated
Member

Ideally this should stay. The plan is to wrap it like here:

RegisterMetric.MustRegister(NewGaugeFuncMetric(prometheus.GaugeOpts{

or here:

registryErrorsTotal = metrics.NewCounterWithOpts(

since our metrics documentation is automated: https://github.com/kubernetes-sigs/external-dns/blob/master/docs/monitoring/metrics.md

Member

Changing this may affect kube monitoring as well, which is not desirable.

Contributor Author

I pushed a new commit which refactors the fix to use the common metrics registry. Does that work? Or do we want to split that into a different PR?

Member

If you have time, it would be better to split this into separate PRs.

Contributor Author

Okay, I've modified this PR so it has what I hope is the minimum set of changes you were hoping for. I'll have another PR out later tonight or tomorrow, branched off this one, with the other changes I made.

Contributor Author

Opened PR #5677 for the metrics refactoring.

@mwmix mwmix force-pushed the fix-aws-ca-bundle branch from 58a3257 to d7eb392 Compare July 22, 2025 22:27
@k8s-ci-robot k8s-ci-robot added docs internal Issues or PRs related to internal code metrics Issues or PRs related to metrics labels Jul 22, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 5, 2025
Collaborator

@mloiseleur mloiseleur left a comment

The code LGTM.
Like for the previous change on metrics, the latest refactor asked by @ivankatliarchuk should go in its own dedicated PR.

=> Would you please extract the code in pkg/http & pkg/metrics in a dedicated refactor PR ?

@mwmix mwmix force-pushed the fix-aws-ca-bundle branch from 90d10ba to 474d1fc Compare August 5, 2025 22:42
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 5, 2025
@mwmix
Contributor Author

mwmix commented Aug 5, 2025

The code LGTM. Like for the previous change on metrics, the latest refactor asked by @ivankatliarchuk should go in its own dedicated PR.

=> Would you please extract the code in pkg/http & pkg/metrics in a dedicated refactor PR ?

Yep can do, new PR opened #5717.

@mwmix mwmix force-pushed the fix-aws-ca-bundle branch 2 times, most recently from 5a6a8b3 to 45c8da8 Compare August 6, 2025 01:45
Member

I wonder how this works without issues. The package pkg/http depends on pkg/metrics. Interesting; I would expect a cyclic dependency.

@mloiseleur
Collaborator

/lgtm
@ivankatliarchuk I'll let you proceed with the final review.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 6, 2025
@ivankatliarchuk
Member

It seems there are no issues, so LGTM as well.

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 6, 2025
@ivankatliarchuk
Member

ivankatliarchuk commented Aug 6, 2025

Fixes #5666

@ivankatliarchuk
Member

ivankatliarchuk commented Aug 6, 2025

Screenshot 2025-08-06 at 09 41 33

Looks like on a 1-CPU machine it will not work: it either won't compile or compiles with some flakiness.

@ivankatliarchuk
Member

/remove-approve

@k8s-ci-robot k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 6, 2025
@mwmix mwmix force-pushed the fix-aws-ca-bundle branch from 45c8da8 to 937be24 Compare August 6, 2025 22:25
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 6, 2025
@mwmix
Contributor Author

mwmix commented Aug 6, 2025

Ah, I see the issue: this isn't using the refactored stuff. I've rebased everything and pushed updates. I can build and run locally.

Member

@ivankatliarchuk ivankatliarchuk left a comment

/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ivankatliarchuk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2025
@k8s-ci-robot k8s-ci-robot merged commit 6bda906 into kubernetes-sigs:master Aug 7, 2025
14 checks passed
troll-os pushed a commit to FiligranHQ/external-dns that referenced this pull request Aug 28, 2025
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. docs internal Issues or PRs related to internal code lgtm "Looks good to me", indicates that a PR is ready to be merged. metrics Issues or PRs related to metrics ok-to-test Indicates a non-member PR verified by an org member that is safe to test. provider Issues or PRs related to a provider size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants