
Kubernetes & IPVS

In this article, we explain the IPVS feature available in Kubernetes (1.9 and later).

Intended audience: sysadmins learning or already working with Kubernetes. Basic knowledge of Kubernetes architecture and workflows is recommended to fully understand the benefits of IPVS.

By Flavien Hardy, Cloud Consultant @ Objectif Libre

What is IPVS?

IPVS (IP Virtual Server) is a Linux kernel feature that provides layer 4 load balancing, also known as layer 4 switching. A stable version of IPVS has been available since Linux 2.6.

In a nutshell, IPVS exposes a service entrypoint behind a single virtual IP. All TCP/UDP traffic going through this endpoint is load-balanced across a pool of real servers (the backends).
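As a standalone illustration (outside Kubernetes), defining such a virtual service by hand with ipvsadm looks roughly like this; the IP addresses below are made up for the example:

# Create a virtual TCP service on an example virtual IP, with round-robin scheduling
ipvsadm -A -t 10.0.0.100:80 -s rr
# Register two example real servers behind it, using NAT (masquerading)
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.10:80 -m
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m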

How does this work with Kubernetes?

The IPVS definition above is also, in essence, the description of a ClusterIP service in Kubernetes.

                                            __
--------------------      --------------    /   POD1
[ Incoming traffic ] ---> [ Service IP ] ->---  POD2
--------------------      --------------    \__ POD3

In previous versions of Kubernetes, services (managed by kube-proxy) were implemented with IPTables rules. The IPVS feature is intended to replace this mechanism: instead of inserting new IPTables rules, kube-proxy can now rely on IPVS to implement services.
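For reference, a minimal ClusterIP service can be created as follows; the service name and ports are arbitrary examples, and pods carrying the matching label are assumed to exist behind it:

# Create an example ClusterIP service mapping port 80 to target port 8080 on the pods
kubectl create service clusterip my-service --tcp=80:8080
# Display the virtual IP (CLUSTER-IP) assigned to it
kubectl get service my-service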

IPVS vs IPTables

As stated before, the default service implementation in Kubernetes uses IPTables. Large deployments (5,000+ services) hit the limits of IPTables:

  • Low packet-processing performance
  • Slow insertion of new rules

These performance issues come from the way IPTables and Netfilter are implemented: rules are evaluated sequentially for each incoming packet, so the more rules there are, the longer the processing takes. IPVS works differently: it uses a hash table managed by the kernel to determine the destination of a packet.

Firewall management is the primary use case for IPTables; when it has to process massive amounts of packets, its performance collapses.
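On a node running kube-proxy in iptables mode, you can get a rough feel for this rule growth yourself (KUBE-SVC is the chain prefix kube-proxy uses for services):

# Count the NAT rules and chains kube-proxy manages for services; this grows with the number of services
iptables-save -t nat | grep -c KUBE-SVC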

The following measurements (performed and provided by Haibin Xie) show the performance differences between IPVS and IPTables:

Metric                 Number of services   IPVS       IPTables
Service access time    1,000                10 ms      7-18 ms
                       10,000               9 ms       80-7,000 ms
                       50,000               9 ms       Non-functional
Memory usage           1,000                386 MB     1.1 GB
                       10,000               542 MB     2.3 GB
                       50,000               1,272 MB   OOM
CPU usage              1,000                0%         N/A
                       10,000               0%         50%-100%
                       50,000               0%         N/A

How to set up IPVS?

IPVS is provided as a beta feature in Kubernetes 1.9. To use it, you must enable the SupportIPVSProxyMode feature gate.
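If you manage kube-proxy yourself, this typically means starting it with flags along these lines (shown here only as an illustration):

# Enable the feature gate and switch kube-proxy to IPVS mode (Kubernetes 1.9)
kube-proxy --feature-gates=SupportIPVSProxyMode=true --proxy-mode=ipvs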

If you deploy your cluster with Kubespray, add the following parameter in the k8s-cluster.yml configuration file:

kube_proxy_mode: ipvs

Impact (kube-proxy only):

  • Enables the SupportIPVSProxyMode feature gate
  • Switches kube-proxy to the IPVS proxy mode
  • Loads additional kernel modules (ip_vs_rr, ip_vs_wrr, ip_vs_sh, nf_conntrack_ipv4), as can be checked below
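To check that the expected modules are loaded on a node, a quick look at lsmod is enough, and modprobe can load a missing one manually:

# List the loaded IPVS-related kernel modules
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
# Load one manually if needed (example: the round-robin scheduler module)
modprobe ip_vs_rr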

The IPVS feature is expected to graduate to stable in Kubernetes 1.10 (https://github.com/kubernetes/kubernetes/pull/58442).

Additional benefits

Load balancing

IPVS for kube-proxy allows the administrator to choose among the most common load balancing methods: round robin (the default), least connection, destination hashing, and so on.

Currently, the load balancing method cannot be changed for a specific service (related GitHub issue).

As of today, if you want to change the default load balancing method in Kubespray, you must update the kube-proxy manifest template.
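Outside of Kubespray, the cluster-wide scheduler is selected through kube-proxy itself; as an illustration, switching to weighted round robin (wrr in IPVS terms) could look like this:

# Ask kube-proxy to program IPVS virtual services with the weighted round robin scheduler
kube-proxy --proxy-mode=ipvs --ipvs-scheduler=wrr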

Administration

IPVS comes with a handy CLI: ipvsadm. It is much more efficient than shell tricks like iptables -L -t nat | grep PATTERN.

Example:

~ # ipvsadm -l -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.233.0.1:443 rr persistent 10800
  -> 195.154.162.187:6443         Masq    1      0          0
  -> 195.154.165.191:6443         Masq    1      0          0
  -> 62.210.115.35:6443           Masq    1      2          0

[...]

This shows the IPVS rules for the in-cluster API server service default/kubernetes:443. The virtual IP is 10.233.0.1:443; incoming TCP packets are load-balanced between three Kubernetes master nodes (IP:6443) using round robin.
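ipvsadm can also display traffic counters per virtual service and per real server, which is handy to check that the load is actually spread across the backends:

# Display packet and byte statistics for each virtual service and real server
ipvsadm -L -n --stats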

Resources: