kubernetes
remove a control plane node from etcd after deleting the node
if you delete a control plane node and receive this error when re-creating it:
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://10.10.0.3:2379 with maintenance client: context deadline exceeded
the deleted node is still registered as an etcd member, so you need to remove the old node from etcd. to do this, connect to another control plane node and run the following:
- list all etcd members, and copy the id of the deleted node
bash
$ kubectl exec etcd-node01 -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list
36f7f01d71a2114, started, node01, https://10.10.0.1:2380, https://10.10.0.1:2379, false
22fa48691f987ab9, started, node02, https://10.10.0.2:2380, https://10.10.0.2:2379, false
4bad0a237bfeb2c6, started, node03, https://10.10.0.3:2380, https://10.10.0.3:2379, false
in this case, the old node is node03, with id 4bad0a237bfeb2c6.
- remove it
bash
$ kubectl exec etcd-node01 -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member remove 4bad0a237bfeb2c6
Member 4bad0a237bfeb2c6 removed from cluster d64f0c95b8a1f61d
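the two steps above can be combined into a small script. this is only a sketch, not a tested tool: the function names `member_id_for` and `etcdctl_in_pod` are made up for illustration, the pod name `etcd-node01` assumes the usual kubeadm naming (`etcd-<node name>`), and the parsing assumes the default comma-separated `member list` output shown above.

```bash
#!/usr/bin/env bash

# hypothetical helper: read `etcdctl member list` output on stdin
# (columns: id, status, name, peer urls, client urls, is-learner)
# and print the id of the member whose name matches $1.
member_id_for() {
  local name="$1"
  awk -F', ' -v n="$name" '$3 == n { print $1 }'
}

# hypothetical wrapper: run etcdctl inside an etcd pod ($1),
# using the same certificate paths as the commands above.
etcdctl_in_pod() {
  local pod="$1"
  shift
  kubectl exec "$pod" -n kube-system -- etcdctl \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/peer.crt \
    --key /etc/kubernetes/pki/etcd/peer.key \
    "$@"
}

# example usage (nothing runs until you call it yourself):
#   id=$(etcdctl_in_pod etcd-node01 member list | member_id_for node03)
#   [ -n "$id" ] && etcdctl_in_pod etcd-node01 member remove "$id"
```

checking that the id is non-empty before calling `member remove` avoids running the command with a missing argument when the node name does not match any member.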