Equinix Metal

BGP with Equinix Metal

When deploying Kubernetes on Equinix Metal with the --controlplane functionality, we need to pre-populate the BGP configuration in order for the control plane to be advertised and work in an HA scenario. Luckily, Equinix Metal provides the capability to "look up" the configuration details (for BGP) that we need in order to advertise our virtual IP (VIP) for HA functionality. We can either make use of the Equinix Metal API or we can parse the Equinix Metal metadata service.

Note: If this cluster will be making use of Equinix Metal for type:LoadBalancer services (by using the Equinix Metal CCM), then we will need to ensure that nodes are set to use an external cloud provider. Before running kubeadm init|join, ensure the kubelet has the correct flags by using the following command: echo KUBELET_EXTRA_ARGS=\"--cloud-provider=external\" > /etc/default/kubelet.
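The kubelet flag above can be scripted as part of node provisioning; below is a minimal sketch that writes to a scratch path so the resulting content is visible (on a real node the target is /etc/default/kubelet and root is required):

```shell
# Stand-in path for illustration; on a real node use /etc/default/kubelet.
KUBELET_DEFAULTS="$(mktemp)"

# Tell the kubelet to defer cloud integration to an external provider (the CCM).
echo 'KUBELET_EXTRA_ARGS="--cloud-provider=external"' > "$KUBELET_DEFAULTS"
cat "$KUBELET_DEFAULTS"
```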

Configure to use a container runtime

Get latest version

We can parse the GitHub API to find the latest version (or we can set this manually):

KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")

or manually:

export KVVERSION=vx.x.x
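For reference, the jq filter in the first command simply takes the name field of the newest entry in the releases array. The same filter is shown here against an illustrative slice of the GitHub API response (the version strings are made up):

```shell
# Illustrative slice of https://api.github.com/repos/kube-vip/kube-vip/releases
# (the real response is an array of release objects, newest first).
RELEASES='[{"name":"v0.8.0"},{"name":"v0.7.2"}]'

# Same filter as the curl pipeline above: the name of the first (newest) release.
KVVERSION=$(echo "$RELEASES" | jq -r ".[0].name")
echo "$KVVERSION"
```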

The easiest method to generate a manifest is using the container itself; the commands below create an alias for different container runtimes.

containerd

alias kube-vip="ctr image pull ghcr.io/kube-vip/kube-vip:$KVVERSION; ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip /kube-vip"

Docker

alias kube-vip="docker run --network host --rm ghcr.io/kube-vip/kube-vip:$KVVERSION"

Creating HA clusters in Equinix Metal

Creating a manifest using the API

We can enable kube-vip with the capability to discover the required configuration for BGP by passing the --metal flag and the API Key and our project ID.

export VIP=<metal_EIP>
export INTERFACE=<interface>

where metal_EIP is the Elastic IP (EIP) address you requested via Metal's UI or API (for more information on how to request an EIP, please see Equinix Metal's EIP documentation), and interface is the interface you announce your VIP from via BGP. By default it's lo:0 in Equinix Metal.

kube-vip manifest pod \
    --interface $INTERFACE \
    --vip $VIP \
    --controlplane \
    --services \
    --bgp \
    --metal \
    --metalKey xxxxxxx \
    --metalProjectID xxxxx | tee /etc/kubernetes/manifests/kube-vip.yaml

where metalKey is your "personal API key" under "Personal Settings" in the Metal portal, and metalProjectID is your Metal "Project ID" under "Project Settings".

Creating a manifest using the metadata

We can parse the metadata service instead; however, this requires that the tools curl and jq are installed.

kube-vip manifest pod \
    --interface $INTERFACE \
    --vip $VIP \
    --controlplane \
    --services \
    --bgp \
    --peerAS $(curl https://metadata.platformequinix.com/metadata | jq '.bgp_neighbors[0].peer_as') \
    --peerAddress $(curl https://metadata.platformequinix.com/metadata | jq -r '.bgp_neighbors[0].peer_ips[0]') \
    --localAS $(curl https://metadata.platformequinix.com/metadata | jq '.bgp_neighbors[0].customer_as') \
    --bgpRouterID $(curl https://metadata.platformequinix.com/metadata | jq -r '.bgp_neighbors[0].customer_ip') | sudo tee /etc/kubernetes/manifests/vip.yaml
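Each flag above fetches the metadata document separately. A sketch that fetches it once and reuses it, shown here against an illustrative metadata fragment with made-up values, could look like this:

```shell
# Illustrative fragment of the document returned by
# https://metadata.platformequinix.com/metadata (all values are made up).
METADATA='{"bgp_neighbors":[{"customer_as":65000,"customer_ip":"10.0.0.2","peer_as":65530,"peer_ips":["169.254.255.1"]}]}'
# On a real node: METADATA=$(curl -s https://metadata.platformequinix.com/metadata)

# The same jq filters as in the manifest command above.
PEER_AS=$(echo "$METADATA" | jq '.bgp_neighbors[0].peer_as')
PEER_ADDRESS=$(echo "$METADATA" | jq -r '.bgp_neighbors[0].peer_ips[0]')
LOCAL_AS=$(echo "$METADATA" | jq '.bgp_neighbors[0].customer_as')
ROUTER_ID=$(echo "$METADATA" | jq -r '.bgp_neighbors[0].customer_ip')
echo "$PEER_AS $PEER_ADDRESS $LOCAL_AS $ROUTER_ID"
```

The variables can then be passed to --peerAS, --peerAddress, --localAS, and --bgpRouterID without hitting the metadata service four times.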

Load Balancing services on Equinix Metal

Below are two examples for running type:LoadBalancer services on worker nodes only; each creates a daemonset that will run kube-vip.

NOTE This use-case requires the Equinix Metal CCM to be installed prior to the kube-vip setup and that the cluster/kubelet is configured to use an "external" cloud provider.

export INTERFACE=<interface>

where interface is the interface you announce your VIP from via BGP. By default it's lo:0 in Equinix Metal.

Using Annotations

This is important as the CCM applies the BGP configuration to the node annotations, making it easy for kube-vip to find the networking configuration it needs to expose load balancer addresses. The flag --annotations metal.equinix.com will cause kube-vip to "watch" the annotations of the worker node that it is running on; once all of the configuration has been applied by the CCM, the kube-vip pod is ready to advertise BGP addresses for the service.

kube-vip manifest daemonset \
  --interface $INTERFACE \
  --services \
  --bgp \
  --annotations metal.equinix.com \
  --inCluster | kubectl apply -f -
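To confirm the CCM has applied these annotations before expecting kube-vip to peer, you can inspect the node object. Below is a sketch of extracting one value, shown against an illustrative node fragment; on a real cluster you would feed kubectl get node <name> -o json into the same filter:

```shell
# Illustrative node-object fragment with CCM-applied BGP annotations
# (on a real cluster: NODE=$(kubectl get node <name> -o json)).
NODE='{"metadata":{"annotations":{"metal.equinix.com/bgp-peers-0-peer-ip":"169.254.255.1","metal.equinix.com/bgp-peers-0-peer-asn":"65530"}}}'

# Pull the peer IP that kube-vip will use for its BGP session.
PEER_IP=$(echo "$NODE" | jq -r '.metadata.annotations["metal.equinix.com/bgp-peers-0-peer-ip"]')
echo "$PEER_IP"
```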

Using the existing CCM secret

Alternatively, it is possible to create a daemonset that uses the existing CCM secret to do an API lookup; this allows discovering the networking configuration needed to advertise load balancer addresses through BGP.

kube-vip manifest daemonset --interface $INTERFACE \
    --services \
    --inCluster \
    --bgp \
    --metal \
    --provider-config /etc/cloud-sa/cloud-sa.json | kubectl apply -f -

Expose with Equinix Metal CCM

Follow Equinix Metal's Elastic IP (EIP) documentation, either through the API, the CLI, or the UI, to create a public IPv4 EIP address, for example 147.75.75.1; this is the address you can expose through BGP as the service load balancer.

# metal ip request -p xxx-bbb-ccc -f ams1 -q 1 -t public_ipv4
+-------+---------------+--------+----------------------+
|   ID  |    ADDRESS    | PUBLIC |       CREATED        |
+-------+---------------+--------+----------------------+
| xxxxx | 147.75.75.1   | true   | 2020-11-10T15:57:39Z |
+-------+---------------+--------+----------------------+

kubectl expose deployment nginx-deployment --port=80 --type=LoadBalancer --name=nginx --load-balancer-ip=147.75.75.1

Troubleshooting

If kube-vip has been waiting for a long time, then you may need to verify that the annotations have been applied correctly by running describe on the node. As of Equinix Metal's CCM v3.3.0, the annotations format was changed. This means you should expect one of the following:

  1. Equinix Metal's CCM v3.3.0 onwards:

kubectl describe node k8s.bgp02
...
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    metal.equinix.com/bgp-peers-0-node-asn: 65000
                    metal.equinix.com/bgp-peers-0-peer-asn: 65530
                    metal.equinix.com/bgp-peers-0-peer-ip: x.x.x.x
                    metal.equinix.com/bgp-peers-0-src-ip: x.x.x.x
  2. Equinix Metal's CCM before v3.3.0:

kubectl describe node k8s.bgp02
...
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    metal.equinix.com/node-asn: 65000
                    metal.equinix.com/peer-asn: 65530
                    metal.equinix.com/peer-ip: x.x.x.x
                    metal.equinix.com/src-ip: x.x.x.x

If there are errors regarding 169.254.255.1 or 169.254.255.2 in the kube-vip logs, then the routes to the ToR switches that provide BGP peering may be missing from the nodes. They can be re-added with the commands below:

GATEWAY_IP=$(curl https://metadata.platformequinix.com/metadata | jq -r ".network.addresses[] | select(.public == false) | .gateway")
ip route add 169.254.255.1 via $GATEWAY_IP
ip route add 169.254.255.2 via $GATEWAY_IP
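For reference, the jq filter above selects the gateway of the node's private (non-public) address. Against an illustrative metadata fragment (made-up addresses), it behaves like this:

```shell
# Illustrative network section of the metadata document (made-up addresses).
METADATA='{"network":{"addresses":[{"public":true,"gateway":"147.75.75.254"},{"public":false,"gateway":"10.68.12.1"}]}}'

# Same filter as above: the gateway of the private (non-public) address.
GATEWAY_IP=$(echo "$METADATA" | jq -r '.network.addresses[] | select(.public == false) | .gateway')
echo "$GATEWAY_IP"
# The ToR routes would then be added via this gateway:
# ip route add 169.254.255.1 via $GATEWAY_IP
```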

Additionally, examining the logs of the Equinix Metal CCM may reveal why the node is not yet ready.