Learn how to enable CPU and Memory HPA (Horizontal Pod Autoscaler) in Kubernetes to automatically scale your deployments based on resource utilization. This step-by-step guide includes Metrics Server installation, deployment creation, and HPA configuration.
Horizontal Pod Autoscaler (HPA) is a critical component in Kubernetes that automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization or other custom metrics. In this article, we will discuss how to enable CPU and memory-based HPA in Kubernetes.
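For reference, the HPA controller derives the desired replica count from the ratio between the observed metric and its target, roughly as described in the Kubernetes documentation:
desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )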
Prerequisites
Before proceeding, ensure that you have the following prerequisites:
- A Kubernetes cluster up and running.
- The kubectl command-line tool installed and configured to interact with your cluster.
Enabling Metrics Server
To enable HPA based on CPU and memory metrics, you must first deploy the Metrics Server in your cluster. Metrics Server collects resource metrics from Kubelets and exposes them via the Kubernetes API.
Deploy Metrics Server
To deploy Metrics Server, apply the latest release manifest:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Verify Metrics Server Installation
To verify that Metrics Server is running, execute the following command:
kubectl get deployment metrics-server -n kube-system
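Once the Metrics Server deployment is ready, you can also confirm that resource metrics are actually being collected (the output will vary with your cluster):
kubectl top nodes
kubectl top pods -n kube-system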
Creating a Deployment
To demonstrate the HPA functionality, we will create a simple deployment using the nginx image.
Create a Deployment YAML
Create a file named nginx-deployment.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21.0
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 128Mi
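Note that the resources.requests values matter for autoscaling: utilization-based HPA targets are calculated against the requests. With a CPU request of 100m and a 70% utilization target, for example, the autoscaler adds replicas once average per-pod CPU usage exceeds roughly 70m.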
Apply the Deployment
Apply the deployment using the following command:
kubectl apply -f nginx-deployment.yaml
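You can confirm the rollout completed before moving on:
kubectl rollout status deployment/nginx-deployment
kubectl get pods -l app=nginx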
Enable CPU and Memory HPA
Now that we have our deployment up and running, let’s create an HPA configuration to scale based on CPU and memory.
Create an HPA YAML
Create a file named nginx-hpa.yaml with the following content:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 128Mi
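Apply the HPA and watch its status; the TARGETS column shows current versus target CPU and memory:
kubectl apply -f nginx-hpa.yaml
kubectl get hpa nginx-hpa --watch
To see scaling in action, you can generate some load against the pods. As a rough sketch (this assumes a Service named nginx-deployment exposing the pods, which this guide does not create):
kubectl run load-generator --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://nginx-deployment; done"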
HPA Use Cases
Here are a few use cases that illustrate when CPU and memory-based HPA can be beneficial in Kubernetes:
| Use Case | Description |
|---|---|
| E-commerce website | During peak times like sales events or holidays, an e-commerce website might experience a surge in traffic, necessitating more resources to handle requests. |
| Media streaming service | A media streaming service needs to scale up when there is an increase in concurrent users streaming content to maintain seamless performance. |
| Data processing pipeline | Data processing pipelines may require additional resources when processing large volumes of data, especially during peak data ingestion periods. |
| Multi-tenant applications | In a multi-tenant application, the varying load from different tenants may require dynamic scaling based on CPU and memory utilization. |
| Online gaming platform | An online gaming platform may experience fluctuations in user count throughout the day, making it essential to scale up or down based on resource usage. |
| Microservices architecture | In a microservices-based system, each service might require dynamic scaling based on the workload, ensuring efficient resource allocation and usage. |
These use cases illustrate how the HPA can automatically scale the number of pods in a deployment based on CPU and memory utilization, ensuring optimal performance and efficient resource usage in Kubernetes.