Troubleshooting ARP
Failover times
The "average" failover time that is observed with kube-vip is around ~3 seconds, however this can wildly depend on the underlying infrastructure such as virtual and physical switches blocking or limiting kube-vip from updating the network. The most simple test that can be performed to begin working out how your infrastructure is performing is the following:
NOTE: If you have set kube-vip to watch a specific namespace, you will need to ensure that this deployment is created in that namespace by adding -n <namespace> to the commands below.
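For example, if kube-vip watches a hypothetical namespace called services, the first command would become:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml -n services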
Deploy a simple nginx application:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
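You can optionally wait for the rollout to complete before exposing the deployment:
kubectl rollout status deployment/nginx-deployment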
Create a LoadBalancer service:
kubectl expose deployment nginx-deployment --port=80 --type=LoadBalancer --name=nginx
Get the service address:
kubectl get svc nginx
NAME    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
nginx   LoadBalancer   10.102.42.152   192.168.0.218   80:32372/TCP   10m
Set the IP address in a variable for the tests below (change the address to suit your environment):
export IP=192.168.0.218
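A quick check that the VIP is answering before starting the monitoring loop (this should print an HTTP status line):
curl -sI http://$IP | head -1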
We can now test against our service address!
The snippet below gives a visual representation of the application's availability: while the service is unreachable, a dot is printed every second; when it recovers, a newline is printed.
DOWN=false; while true; do if ! curl --output /dev/null --silent --head --fail --connect-timeout 0.1 "$IP"; then echo -n "."; DOWN=true; elif [ "$DOWN" = true ]; then echo ""; DOWN=false; fi; sleep 1; done
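If you want a measured outage duration rather than a row of dots, the same loop can be adapted to timestamp the failure window. This is a minimal sketch using the same curl options; it polls once per second, so the result has roughly one-second granularity:
DOWN=false
while true; do
  if ! curl --output /dev/null --silent --head --fail --connect-timeout 0.1 "$IP"; then
    # Record when the VIP first stopped answering
    if [ "$DOWN" = false ]; then START=$(date +%s); DOWN=true; fi
  elif [ "$DOWN" = true ]; then
    # The VIP is answering again; report how long it was away
    echo "VIP unavailable for $(( $(date +%s) - START )) seconds"
    DOWN=false
  fi
  sleep 1
done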
Additionally, we can test access to the VIP with the ping command; this will show the address either becoming unavailable or being re-assigned to a new host.
ping -D $IP
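Because ARP is the mechanism that actually moves the VIP between hosts, it is also worth watching which MAC address answers for it. This assumes the arping utility is installed and that eth0 is an interface on the same L2 segment as the VIP (adjust for your environment); on failover the replies should switch to the new leader's MAC address:
arping -I eth0 $IP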
Testing access
With this simple monitoring in progress, we can watch how long both Kubernetes and kube-vip typically take to reconcile!
Delete a backend pod:
kubectl delete pod $(kubectl get pods | grep nginx-deployment | awk '{ print $1 }')
Doing this a few times should result in something like the following, where each dot represents roughly one second of unavailability:
..
..
.
.
..
.
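Deleting backend pods exercises Kubernetes' endpoint reconciliation. To exercise kube-vip's ARP failover itself, you can instead delete the kube-vip pod that currently holds leadership. One way to find the leader is through the leader-election Lease (plndr-svcs-lock is the default services lease name; yours may differ):
kubectl get lease -n kube-system plndr-svcs-lock -o jsonpath='{.spec.holderIdentity}'
The holder identity typically names the node currently announcing the VIP. Deleting the kube-vip pod on that node forces a new election, and the monitoring loops above will show how long the VIP takes to move (the label selector below assumes the default DaemonSet manifest; adjust it to match your deployment):
kubectl delete pod -n kube-system -l app.kubernetes.io/name=kube-vip-ds --field-selector spec.nodeName=<leader-node>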
VIP Preservation Feature
If you have enabled the PreserveVIPOnLeadershipLoss feature (vip_preserve_on_leadership_loss=true), VIP transitions behave differently:
Expected Behavior with VIP Preservation
- When a node loses leadership, it keeps the VIP on its interface but stops ARP/NDP broadcasting
- The VIP is only removed when a new leader successfully takes over
- This provides more graceful failover with potentially shorter disruption windows
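As a sketch of how the flag can be enabled, assuming kube-vip runs as a DaemonSet named kube-vip-ds in kube-system (the name the default manifests generate; adjust to your deployment), the environment variable can be set with:
kubectl set env daemonset/kube-vip-ds -n kube-system vip_preserve_on_leadership_loss=true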
Troubleshooting VIP Preservation
Check if the feature is enabled:
kubectl get pod <kube-vip-pod> -n kube-system -o yaml | grep vip_preserve_on_leadership_loss
Check leader transition logs:
kubectl logs -n kube-system <kube-vip-pod> | grep -E "preserve|leadership|took over"
Log messages when feature is enabled:
"VIP addresses remain on interface, only stopped ARP/NDP broadcasting""took over VIP as new leader""cleaned up preserved VIP to avoid conflict"
IPv6 Special Case: Note that IPv6 VIPs are always removed immediately on leadership loss (even when this feature is enabled) to prevent Duplicate Address Detection (DAD) failures. If you are using IPv6 VIPs, they will not be preserved. (Ref: RFC 4429)
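If you are troubleshooting IPv6 specifically, the DAD state of addresses on a node can be inspected directly; a quick check, assuming eth0 carries the VIP (addresses flagged tentative are still undergoing DAD, while dadfailed indicates a detected duplicate):
ip -6 addr show dev eth0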