Monitoring K8S Service Resource Usage: A Complete Guide
Overview
Monitoring resource usage of services deployed on Kubernetes is essential for maintaining cluster health, optimizing performance, and ensuring cost efficiency. This guide provides comprehensive methods to track and analyze resource consumption across your K8S environment.
Warning
Always monitor resource usage regularly to prevent resource exhaustion and ensure optimal performance of your applications.
Basic Resource Monitoring
Using kubectl top
The simplest way to check resource usage:
# Check resource usage for all pods
kubectl top pods
# Check resource usage for nodes
kubectl top nodes
# Watch real-time resource usage
kubectl top pods --watch
# Show specific namespace
kubectl top pods -n namespace-name
# Sort by memory usage
kubectl top pods --sort-by=memory
# Sort by CPU usage
kubectl top pods --sort-by=cpuResource Quotas and Limits
# Check resource quotas
kubectl get quota -n namespace-name
# Check resource limits and requests
kubectl get limits -n namespace-name
# Check pod resource requests and limits
kubectl get pods -o custom-columns=NAME:.metadata.name,CPU:.spec.containers[*].resources.requests.cpu,MEM:.spec.containers[*].resources.requests.memory,CPULIMIT:.spec.containers[*].resources.limits.cpu,MEMLIMIT:.spec.containers[*].resources.limits.memoryAdvanced Monitoring Commands
Detailed Resource Analysis
# Get detailed pod resource information
kubectl get pods -o yaml | grep -E "(resources|requests|limits)"
# Check resource usage by namespace
kubectl get ns -o custom-columns=NAME:.metadata.name,RESOURCEREQUESTS:.status.resources.requests.cpu,RESOURCELIMITS:.status.resources.limits.cpu
# Get container-specific resource usage
kubectl top pods --containersNode Resource Analysis
# Get detailed node information with resource usage
kubectl describe nodes
# Check allocatable vs allocated resources
kubectl get nodes -o custom-columns=NAME:.metadata.name,ALLOCATABLECPU:.status.allocatable.cpu,ALLOCATABLEMEM:.status.allocatable.memory,ALLOCATEDCPU:.status.capacity.cpu,ALLOCATEDMEM:.status.capacity.memory
# Get node capacity and allocatable resources
kubectl get nodes -o jsonpath='{.items[*].status.capacity}{"\n"}'
kubectl get nodes -o jsonpath='{.items[*].status.allocatable}{"\n"}'Using K8S Metrics Server
Install and Verify Metrics Server
# Install metrics server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Check metrics server status
kubectl get deployment -n kube-system metrics-server
kubectl get pods -n kube-system metrics-server-xxxxx
# Verify metrics are available
kubectl top nodes
kubectl top podsMetrics Server Configuration
# Check metrics server logs
kubectl logs -n kube-system deployment/metrics-server
# Configure metrics server (if needed)
kubectl edit deployment -n kube-system metrics-serverUsing Prometheus and Grafana
Prometheus Configuration
# prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: monitoring
data:
prometheus.yml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: trueGrafana Dashboards
# Deploy Grafana
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install grafana grafana/grafana
# Access Grafana
kubectl port-forward svc/grafana 3000:80
# Import K8S dashboards (ID: 6417, 122, 7362)Resource Usage Analysis
Critical Resource Checks
# Check pods with high resource usage
kubectl top pods --sort-by=cpu | head -10
kubectl top pods --sort-by=memory | head -10
# Check pods exceeding limits
kubectl get pods -n namespace-name --all-namespaces -o custom-columns=NAME:.metadata.name,REQUESTS:.spec.containers[*].resources.requests.cpu,LIMITS:.spec.containers[*].resources.limits.cpu,USAGE:$.spec.containers.resources.requests.cpu
# Check for pods in pending state due to resource constraints
kubectl get pods -n namespace-name --field-selector=status.phase=PendingResource Efficiency Analysis
# Calculate resource utilization
kubectl top nodes | awk '
{
cpu_usage = $3
cpu_total = $4
mem_usage = $5
mem_total = $6
cpu_util = cpu_usage / cpu_total * 100
mem_util = mem_usage / mem_total * 100
printf "Node: %s - CPU: %.1f%% Memory: %.1f%%\n", $1, cpu_util, mem_util
}'
# Check resource requests vs usage
kubectl top pods --no-headers | while read name cpu mem rest; do
requests=$(kubectl get pod $name -o jsonpath='{.spec.containers[*].resources.requests.cpu}')
echo "Pod: $name - CPU: $cpu - Requests: $requests"
doneResource Management Best Practices
Setting Resource Limits
# Example pod with resource limits
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: resource-demo
spec:
containers:
- name: frontend
image: nginx
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- name: backend
image: redis
resources:
requests:
memory: "128Mi"
cpu: "500m"
limits:
memory: "256Mi"
cpu: "1000m"
EOFHorizontal Pod Autoscaler (HPA)
# Create HPA for CPU-based scaling
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10
# Create HPA for memory-based scaling
kubectl autoscale deployment my-app --cpu-percent=50 --memory-percent=70 --min=2 --max=10
# Check HPA status
kubectl get hpa
# Check HPA details
kubectl describe hpa my-appTroubleshooting Resource Issues
Common Resource Problems
# Check for evicted pods
kubectl get pods --all-namespaces | grep Evicted
# Check for pods in CrashLoopBackOff
kubectl get pods --field-selector=status.phase=Running --all-namespaces | grep -i crash
# Check node resource pressure
kubectl describe node <node-name> | grep -i pressure
# Check for container resource limits
kubectl get pods -o json | jq '.items[].spec.containers[] | {name: .name, requests: .resources.requests, limits: .resources.limits}'Resource Leak Detection
# Monitor memory usage trends
kubectl top pods --watch -n namespace-name
# Check for memory leaks
kubectl get pods -n namespace-name -o jsonpath='{.items[*].status.containerStatuses[*].restartCount}' | tr ' ' '\n' | sort -n | uniq -c
# Check container resource limits
kubectl get pods -n namespace-name -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.memory}{"\t"}{.resources.limits.memory}{"\n"}{end}{end}'Automated Monitoring Scripts
Daily Resource Report
#!/bin/bash
# k8s-resource-report.sh
echo "=== K8S Resource Usage Report ==="
echo "Date: $(date)"
echo ""
echo "=== Node Resource Usage ==="
kubectl top nodes | column -t
echo ""
echo "=== Top CPU-Consuming Pods ==="
kubectl top pods --sort-by=cpu | head -10
echo ""
echo "=== Top Memory-Consuming Pods ==="
kubectl top pods --sort-by=memory | head -10
echo ""
echo "=== Pending Pods (Resource Constraints) ==="
kubectl get pods --field-selector=status.phase=Pending
echo ""
echo "=== Evicted Pods ==="
kubectl get pods --all-namespaces | grep EvictedResource Alert Script
#!/bin/bash
# k8s-resource-alert.sh
NAMESPACES=("production" "staging" "development")
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
for namespace in "${NAMESPACES[@]}"; do
echo "Checking namespace: $namespace"
# Check node resource usage
echo "Node Resource Usage:"
kubectl top nodes --no-headers | while read name cpu mem rest; do
cpu_usage=$(echo "$cpu" | sed 's/%//')
mem_usage=$(echo "$mem" | sed 's/%//')
if (( $(echo "$cpu_usage > $CPU_THRESHOLD" | bc -l) )) || (( $(echo "$mem_usage > $MEMORY_THRESHOLD" | bc -l) )); then
echo "WARNING: $node - CPU: $cpu%, Memory: $mem%"
fi
done
# Check pod resource usage
echo "High Resource Usage Pods:"
kubectl top pods -n $namespace --sort-by=cpu | head -5
kubectl top pods -n $namespace --sort-by=memory | head -5
echo ""
doneSecurity Considerations
Resource Security Best Practices
# Set resource limits on critical namespaces
kubectl limitrange -n production --create --limits=cpu=1000,memory=8Gi
# Implement resource quotas
kubectl quota -n production --hard=pods=10,services=5,requests.cpu=1000,memory=8G
# Check resource security policies
kubectl get resourcequota -A
kubectl get limitrange -A
# Monitor resource security violations
kubectl get events --field-selector=reason=FailedScheduling -AChangelog
8/9/25, 3:02 PM
View All Changelog
d321d-feat: introduce to K8Son
Copyright
Copyright Ownership:WARREN Y.F. LONG
License under:Attribution-NonCommercial-NoDerivatives 4.0 International (CC-BY-NC-ND-4.0)