Ensuring Zero Downtime: A Guide to Pod Termination in Kubernetes
Introduction
Just as Kubernetes handles pod creation, it also manages pod termination, and handling termination carefully is what makes zero downtime achievable. Consider an application that runs 20 pods at peak times but scales down to 2 pods during low-traffic periods because its minimum replica count is set to 2. Deleting the extra 18 pods might seem straightforward, but abrupt termination can drop in-flight requests, so users experience disruptions while accessing the service. Let's understand this in detail.
What is a Pod in Kubernetes?
Before explaining pod termination, let's get an overview of what a pod is. A pod is the smallest deployable unit in Kubernetes. It represents a single instance of a running workload in your cluster and contains one or more containers that share the same network namespace and can share storage volumes.
Reasons for Pod Termination
- Scaling Down: Reducing the number of running instances during low traffic.
- Rolling Updates: Replacing old pods with new ones during updates.
- Manual Deletion: A user or script deletes the pod directly, for example with kubectl delete pod.
- Node Failure: Terminating and rescheduling pods on failing nodes.
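- Resource Constraints: Evicting pods when a node runs low on resources such as memory or disk, or killing a container that exceeds its memory limit. Exceeding a CPU limit results in throttling, not termination.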
Kubernetes Pod Termination Process
Key components involved in pod termination include:
- Kubelet: The agent that runs on every node. It watches the API server for changes to the pods assigned to its node, starts and stops their containers, and reports pod status back to the control plane.
- Endpoint: The list of pod IP addresses and ports behind a Service. The control plane keeps this list up to date as pods come and go, and Kube-proxy on each node watches it to set up IP table rules accordingly.
- SIGTERM Signal: Sent by the kubelet, through the container runtime, to the main process of each container to initiate a graceful shutdown. The application is expected to stop accepting new requests and complete ongoing tasks.
- Grace Period: The total time a pod is given to shut down before it is force-terminated. The default is 30 seconds and can be configured with terminationGracePeriodSeconds. The countdown starts when termination begins, so it includes any time spent in a pre-stop hook.
- SIGKILL Signal: Sent if a container is still running when the grace period expires, forcing immediate termination.
- Updating Endpoints: The terminating pod is removed from the Service's endpoints, Kube-proxy adjusts IP table rules accordingly, and the kubelet reports the pod's final status to the control plane.
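To make the SIGTERM step more concrete, here is a minimal Python sketch of how an application might react to the signal: it records the request, stops accepting new work, and finishes what it already accepted. The names shutdown_requested, handle_sigterm, install_handler, serve, and drain are hypothetical, and jobs are modeled as plain callables rather than real network requests.

```python
import signal
import threading

# Set when SIGTERM arrives; the accept loop checks it before taking new work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; keep the handler itself minimal."""
    shutdown_requested.set()


def install_handler():
    """Register the handler. Call once at startup, from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def drain(accepted_jobs):
    """Finish jobs that were already accepted and return their results."""
    return [job() for job in accepted_jobs]


def serve(incoming_jobs):
    """Accept jobs until shutdown is requested, then drain what was accepted.

    `incoming_jobs` is a hypothetical iterable of zero-argument callables
    standing in for incoming requests.
    """
    accepted = []
    for job in incoming_jobs:
        if shutdown_requested.is_set():
            break  # stop accepting new work
        accepted.append(job)
    return drain(accepted)
```

A real server would also close listeners and persistent connections during the drain step. Whatever it does, the work has to finish within the grace period, because SIGKILL cannot be caught or handled.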
Pod Eviction Process
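When a pod is deleted or evicted, the API server records a deletion timestamp on the pod object, which is stored in the etcd database, and the pod is then reported as Terminating (for example, in kubectl output). This triggers the termination process.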
Kubelet’s Role:
- Detects the Terminating state.
- Runs the pre-stop hook, if one is defined, then sends a SIGTERM signal to the main process of each container in the pod, allowing the application to perform the necessary cleanup.
- Waits up to the grace period defined by terminationGracePeriodSeconds. Sends a SIGKILL signal to any container still running when this period expires.
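The kubelet's sequence can be sketched as a small timeline model. This is a simplified illustration, not the kubelet's actual implementation: it assumes a single container, treats the pre-stop hook and the application's cleanup as fixed durations, and ignores edge cases such as a hook that overruns the deadline. The names simulate_termination and TerminationOutcome are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TerminationOutcome:
    sigterm_sent_at: float  # seconds after termination began
    exited_at: float        # when the container stopped
    force_killed: bool      # True if SIGKILL was needed


def simulate_termination(grace_period: float,
                         pre_stop_seconds: float,
                         app_shutdown_seconds: float) -> TerminationOutcome:
    """Model the ordering: pre-stop hook, SIGTERM, then SIGKILL at the deadline.

    The grace period clock starts when termination begins, so time spent in
    the pre-stop hook reduces what the application has left after SIGTERM.
    """
    # SIGTERM follows the hook, but never later than the deadline (simplification).
    sigterm_at = min(pre_stop_seconds, grace_period)
    wanted_exit = sigterm_at + app_shutdown_seconds
    if wanted_exit <= grace_period:
        return TerminationOutcome(sigterm_at, wanted_exit, False)
    # The application ran out of time: it is killed at the deadline.
    return TerminationOutcome(sigterm_at, grace_period, True)
```

With the default 30-second grace period, a 5-second hook followed by 10 seconds of cleanup exits cleanly at 15 seconds, while a 20-second hook followed by 15 seconds of cleanup is force-killed at 30 seconds.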
Endpoints-Controller’s Role:
- Manages service endpoints in the Kubernetes control plane.
- Removes the terminating pod from service endpoints, preventing new traffic from being routed to it.
Timing of SIGTERM Signal and Endpoint Removal
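The kubelet and the endpoints-controller react to the Terminating state independently and in parallel, so the order of SIGTERM delivery and endpoint removal is not guaranteed. If the pod is removed from service endpoints, and Kube-proxy has updated its rules, before the kubelet sends the SIGTERM signal, the pod can terminate gracefully without impacting incoming traffic. Conversely, if the SIGTERM signal is sent before endpoint removal completes, new requests might still be routed to a pod that is already shutting down, causing failed requests and potential downtime.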
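The race can be expressed as simple arithmetic. The sketch below assumes both timers start when the pod is marked for deletion and that the application stops accepting connections as soon as it receives SIGTERM. The function name and its parameters are hypothetical, and the endpoint removal time is something you would need to measure in your own cluster.

```python
def misrouted_window(endpoint_removal_seconds: float,
                     sigterm_delay_seconds: float) -> float:
    """Seconds during which traffic may still reach a pod that already got SIGTERM.

    endpoint_removal_seconds: hypothetical time until every node has stopped
        routing to the pod (endpoint update plus Kube-proxy rule sync).
    sigterm_delay_seconds: time before SIGTERM reaches the application,
        for example the duration of a pre-stop hook (0 if there is none).
    """
    return max(0.0, endpoint_removal_seconds - sigterm_delay_seconds)
```

For example, if endpoint removal takes 3 seconds and SIGTERM is sent immediately, requests can be misrouted for up to 3 seconds. Delaying SIGTERM by 10 seconds closes that window.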
Graceful Shutdown
To ensure pods terminate gracefully, it’s important to close all persistent connections (e.g., databases, queues, websockets) and wait for active requests to complete. This can be achieved using Kubernetes pre-stop hooks and setting the terminationGracePeriodSeconds parameter.
Implementing Pre-Stop Hooks:
- Pre-stop Hook: Runs before the SIGTERM signal is sent to the container. It is commonly used to introduce a short delay, providing buffer time for ongoing requests to complete and for the pod’s endpoint to be removed from service.
- Termination Grace Period: The total time Kubernetes allows for shutdown, including the pre-stop hook, before sending the SIGKILL signal. Set it longer than the pre-stop delay plus the time the application needs for cleanup.
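As an illustration of how the two settings fit together, the following Python sketch builds a pod manifest with a pre-stop sleep and a matching grace period, then prints it as JSON. The field names (terminationGracePeriodSeconds, lifecycle.preStop.exec.command) are the standard pod spec fields. The pod name, image, and the 10 and 45 second values are placeholder assumptions, and the exec hook assumes the container image ships a sleep binary.

```python
import json


def build_pod_manifest(name: str, image: str,
                       pre_stop_sleep: int = 10,
                       grace_period: int = 45) -> dict:
    """Return a pod manifest with a pre-stop delay and a termination grace period."""
    # The hook's time counts toward the grace period, so it must leave room
    # for the application's own cleanup after SIGTERM.
    if pre_stop_sleep >= grace_period:
        raise ValueError("pre-stop delay must be shorter than the grace period")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "terminationGracePeriodSeconds": grace_period,
            "containers": [{
                "name": name,
                "image": image,
                "lifecycle": {
                    "preStop": {
                        # Delay SIGTERM so endpoint removal can propagate first.
                        "exec": {"command": ["sleep", str(pre_stop_sleep)]}
                    }
                },
            }],
        },
    }


if __name__ == "__main__":
    # "web" and the image reference are placeholders, not real resources.
    print(json.dumps(build_pod_manifest("web", "example.com/web:1.0"), indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be piped to kubectl apply -f - for experimentation. In practice these fields usually live in a Deployment's pod template rather than a standalone pod.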
Conclusion
Implementing pre-stop hooks and setting an appropriate termination grace period lets pods drain traffic and terminate gracefully, which helps maintain application availability and performance during scale-downs and rolling updates.