externalTrafficPolicy: Local via BGP #10537
-
Hey guys, I want to use externalTrafficPolicy: Local for optimised endpoints. However, when I do this with a Service of type LoadBalancer, the /32 external address, which is handed out from a Calico LoadBalancer IP pool, continues to be advertised from all nodes. Traffic only routes to the pod if I manually constrain BGP to peer only with the node that is running the pod. Here is an example:
kubectl get pod,svc,ep -o wide
kubectl get nodes -o wide
This is the routing table on the ToR for the service address of 10.44.0.9.
If I constrain BGP on the ToR to communicate only with node rke2-test-worker-pool-58fmt-hzt7v on 10.254.32.105, then everything works fine. Also, if I change the service to externalTrafficPolicy: Cluster, it works fine, as the endpoints are on all nodes. This page says "The nodes with a pod backing the service advertise a specific route (/32 or /128) to the service's IP.", but that doesn't appear to work for me. I'm running the versions below. Any ideas?
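For reference, the setup described above corresponds to a Service roughly like the following. This is a minimal sketch and not the manifest from the cluster in question: the name, selector, and ports are hypothetical.

```yaml
# Hypothetical Service for illustration only.
# The name, selector and ports are assumptions, not taken from this thread.
apiVersion: v1
kind: Service
metadata:
  name: example-svc
spec:
  type: LoadBalancer
  # Local: only nodes with a ready local endpoint should receive external traffic.
  externalTrafficPolicy: Local
  selector:
    app: example
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
```

With this policy, the expectation quoted from the documentation is that only nodes running a backing pod advertise the /32 for the service's external IP.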
Replies: 7 comments 9 replies
-
You may be hitting one of these scenarios: #7512, #3810. The TL;DR is that if you have full-mesh enabled between your Calico nodes, they will advertise the Service LoadBalancer IPs to each other and may re-advertise those IPs, substituting themselves as the next hop, depending on your BGP topology. Fixes include:
-
Cheers for the assist, though it still isn't quite right. I tried with the mesh disabled and a route reflector on the ToR, and separately with keepOriginalNextHop: true. Neither produces the right result. Here is what I see with keepOriginalNextHop: true. The ToR has no route reflector and meshing is enabled.
I should see routes to 10.254.32.105 only, but I am also seeing 4 from 10.254.32.104. I restarted all BGP processes and deleted the calico-node on 10.254.32.104, but it came back and continued to advertise from a node that does not run the pod for the service. Hmm.
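For readers following along, the two settings tried above live on the Calico BGPConfiguration and BGPPeer resources. The sketch below shows where each one goes. It is not the configuration used in this cluster: the peer name, peer address, and AS number are hypothetical, and the two settings were tried separately rather than together.

```yaml
# Attempt 1: disable the node-to-node full mesh (cluster-wide setting).
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
---
# Attempt 2: ask Calico to keep the original next hop when advertising to the ToR.
# peerIP and asNumber are placeholder values for illustration.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: tor-peer
spec:
  peerIP: 192.0.2.1
  asNumber: 64512
  keepOriginalNextHop: true
```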
-
Does not A route reflector is going to behave much the same as a full mesh, as its job is to distribute routes within the AS and effectively replace the need for a full mesh, so I'd expect similar results when using an RR.
-
OK, so here is why, but I don't get it. The calico-node pods and their nodes, for reference:
Output from calico-node -show-status on the pods:
-
So this is what is happening, but I don't know why it happens, if you get me. The Calico-assigned external IP of 10.44.0.13/32 has been placed onto node 10.254.32.106, where it is locally attached. The pod behind the service is on node 10.254.32.105. Calico is advertising the directly attached address alongside the EndpointSlice/pod behind the service. Is this a bug? IP config from node 10.254.32.106:
Config of the svc etc. from kubectl to map it:
-
The solution for me was to combine Calico BGP peers with nodeSelectors, to exclude the control plane nodes that host the IP of the LoadBalancer service directly on eth0 (which is being advertised out directly). I now correctly see paths only to those nodes that host pods belonging to the service. externalTrafficPolicy: Local is now functioning correctly. It does sound like the documentation needs a bit of an edit to explain this better. Out of the box, as described, it doesn't "just work" - or at least it's not as simple as the documentation suggests.
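The approach described above could look roughly like the following BGPPeer. This is a hedged sketch, not the exact configuration from this cluster: the peer name, peer address, and AS number are hypothetical, and it assumes the control plane nodes carry the node-role.kubernetes.io/control-plane label.

```yaml
# Only nodes matching nodeSelector peer with the ToR, so control plane
# nodes never advertise the service IP that sits on their eth0.
# peerIP, asNumber and the label are assumptions for illustration.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: tor-peer-workers-only
spec:
  peerIP: 192.0.2.1
  asNumber: 64512
  nodeSelector: "!has(node-role.kubernetes.io/control-plane)"
```

The trade-off of this design is that nothing on the excluded nodes is advertised to that peer, so it only suits clusters where service pods never run on the control plane.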
-
I just wanted to be a little clearer on what exactly worked for me. The configuration below is the only way I got it working.
With meshing enabled, the service IP address that sits directly on eth0 of the control plane node is re-advertised back to the ToR. Interestingly, "keepOriginalNextHop: true" made no difference to the routes received on the ToR. For some reason it was ignored for the directly connected IP on eth0 on the control plane node.
It's a pretty basic setup, I think. The question that arises for me is: can you configure BIRD via Calico to ignore locally connected addresses on eth0? What effect does that have on the wider system? Given that I've tailored the BGP routing to effectively ignore the directly connected IP on eth0, it seems it won't have a negative effect - well, for services of type LoadBalancer at least. Most of the interfaces we care about are cali* interfaces, so why is this external load balancer IP on eth0? It might be worthy of a deeper investigation to better align things. Let me know if you want any more information than we already have.
Calico Config
Basic config of the deployment and service. My naming and choice of containers chop and change a lot, so forgive me for it not being overly tidy.