Removing Terminated pods from Kubernetes
In recent versions of Kubernetes, when pods are killed by a node drain or shutdown, they end up in the Failed phase with a reason of Terminated (or NodeShutdown on newer releases). They hang around for a while before eventually getting garbage collected, but in the meantime they can mess with some stats, and they look kinda scary if you aren't prepared to see them.
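A trimmed status stanza from one of these pods looks something like this (the exact reason and message vary by version; newer kubelets report NodeShutdown):

status:
  message: Pod was terminated in response to imminent node shutdown.
  phase: Failed
  reason: Terminated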
I prefer keeping a clean slate, so I run a cron that cleans up these pods using this lovely command:
kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if eq .status.phase "Failed"}}{{if or (eq .status.reason "Terminated") (eq .status.reason "NodeShutdown")}}{{.metadata.namespace}} {{.metadata.name}}{{printf "\n"}}{{end}}{{end}}{{end}}' | xargs -n2 kubectl delete pod -n
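If you'd rather preview what it would remove before letting it loose, the same pipeline works with a client-side dry run (kubectl has supported --dry-run=client since v1.18):

kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if eq .status.phase "Failed"}}{{if or (eq .status.reason "Terminated") (eq .status.reason "NodeShutdown")}}{{.metadata.namespace}} {{.metadata.name}}{{printf "\n"}}{{end}}{{end}}{{end}}' | xargs -n2 kubectl delete pod --dry-run=client -n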
Since this runs kubectl, it needs some RBAC and a container image that has kubectl available. Here is the full set of manifests; hopefully it helps someone!
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: shutdown-cleaner
rules:
- verbs:
  - delete
  - get
  - list
  resources:
  - pods
  apiGroups:
  - ""
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: shutdown-cleaner
subjects:
- kind: ServiceAccount
  name: shutdown-cleaner
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: shutdown-cleaner
---
kind: ServiceAccount
apiVersion: v1
metadata:
  namespace: kube-system
  name: shutdown-cleaner
---
kind: CronJob
apiVersion: batch/v1
metadata:
  name: shutdown-cleaner
  namespace: kube-system
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  suspend: false
  # Skip a run if the previous one is still going
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      parallelism: 1
      completions: 1
      backoffLimit: 3
      # Kill the job if it somehow runs longer than a minute
      activeDeadlineSeconds: 60
      template:
        spec:
          serviceAccountName: shutdown-cleaner
          containers:
          - name: shutdown-cleaner
            image: bitnami/kubectl:1.21.5
            command:
            - /bin/sh
            - -c
            - kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if eq .status.phase "Failed"}}{{if or (eq .status.reason "Terminated") (eq .status.reason "NodeShutdown")}}{{.metadata.namespace}} {{.metadata.name}}{{printf "\n"}}{{end}}{{end}}{{end}}' | xargs -n2 kubectl delete pod -n
          restartPolicy: Never
  schedule: "0 * * * *"
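To roll it out, apply the manifests, sanity-check that the service account can actually delete pods, and kick off a run by hand instead of waiting for the top of the hour (shutdown-cleaner.yaml here is just whatever file you saved the manifests to):

kubectl apply -f shutdown-cleaner.yaml
kubectl auth can-i delete pods --all-namespaces --as=system:serviceaccount:kube-system:shutdown-cleaner
kubectl create job --from=cronjob/shutdown-cleaner -n kube-system shutdown-cleaner-manual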