Error #1: "Resource Exhausted" — Troubleshooting and Fix

Regular monitoring and proactive resource management are essential to maintaining a stable Kubernetes environment.

DevOps Diaries

Hey — It's Avinash Tietler 👋

Here you get use cases, top news, tools, and articles from the DevOps world.

IN TODAY'S EDITION

🔬Use Case:

Why am I getting a "Resource Exhausted" error in Kubernetes?

📺Top News:

📢 Remote Job:

EXL is hiring a DevOps Engineer - Location: Worldwide (Remote)

📑Resource:

USE CASE

Why am I getting a "Resource Exhausted" error in Kubernetes?

Possible Causes

  1. Insufficient Node Resources: The cluster nodes do not have enough CPU or memory to schedule new pods.

  2. Improper Resource Requests and Limits: Pods request more resources than are available, or no resource limits are defined, leading to over-utilization.

  3. High Cluster Workload: The cluster is running too many pods, causing resource contention.

  4. Evictions: Nodes under resource pressure evict pods to reclaim resources.

Step-by-Step Resolution

Identify the Cause

  • Check Pod Events:

    kubectl describe pod <pod-name>

Look for messages like Insufficient CPU or Insufficient memory.

  • Check Node Status:

    kubectl get nodes

    kubectl describe node <node-name>

    Review node resource capacity and utilization.

  • Monitor Resource Usage: Use metrics-server or tools like Prometheus and Grafana to monitor CPU and memory usage across the cluster.
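With metrics-server installed, `kubectl top` gives a quick snapshot of current usage without a full monitoring stack:

```shell
# Per-node CPU and memory usage (requires metrics-server)
kubectl top nodes

# Per-pod usage across all namespaces, highest memory consumers first
kubectl top pods --all-namespaces --sort-by=memory
```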

Verify Resource Requests and Limits

  • Ensure that each pod has proper resource requests and limits defined.

    resources:
      requests:
        memory: "128Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "1000m"

  • Check existing pod configurations:

    kubectl get pods -o yaml

Adjust Resource Requests and Limits

  • Modify Resource Configuration: Update resource requests and limits in the pod specification to match actual needs.

kubectl edit deployment <deployment-name>
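In the deployment manifest, the `resources` stanza belongs under each container in the pod template. A minimal sketch (the container name and image are placeholders):

```yaml
spec:
  template:
    spec:
      containers:
        - name: app            # placeholder container name
          image: app:latest    # placeholder image
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
```

Requests are what the scheduler reserves; limits are the hard cap enforced at runtime.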

Scale the Application

  • If the workload is high, scale up the application using replicas:

    kubectl scale deployment <deployment-name> --replicas=<number>

Add More Nodes to the Cluster

  • For On-Prem Clusters: Add more nodes manually.

  • For Cloud Providers: Enable cluster auto-scaling. Example (GKE):

gcloud container clusters update <cluster-name> --enable-autoscaling --min-nodes=<min> --max-nodes=<max>

Implement Resource Quotas

  • Limit resource usage per namespace to prevent overuse:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: resource-quota
      namespace: <namespace>
    spec:
      hard:
        requests.cpu: "4"
        requests.memory: "8Gi"
        limits.cpu: "8"
        limits.memory: "16Gi"
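A LimitRange can complement the quota by giving default requests and limits to containers that declare none — closing the "no resource limits defined" gap noted above. The values below are illustrative:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: <namespace>
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no requests
        cpu: "250m"
        memory: "128Mi"
      default:               # applied when a container sets no limits
        cpu: "500m"
        memory: "256Mi"
```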

Enable Cluster Autoscaler

  • Use Kubernetes Cluster Autoscaler to dynamically add nodes when resources are exhausted.

Optimize Existing Resources

  • Identify and terminate unused pods.

  • Reduce resource limits for less critical applications.

Monitor and Prevent Future Issues

  • Set up alerting for high resource utilization.

  • Use Vertical Pod Autoscaler (VPA) or Horizontal Pod Autoscaler (HPA) for dynamic scaling.
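As a sketch, an HPA that scales a deployment on CPU utilization might look like this (the name and thresholds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <deployment-name>
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```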

By identifying the root cause, ensuring proper resource configuration, and scaling your cluster appropriately, you can resolve and prevent "Resource Exhausted" errors. Regular monitoring and proactive resource management are essential to maintaining a stable Kubernetes environment.
