Wei Zhou 69e8ebc03f
CKS: retry if unable to drain node or unable to upgrade k8s node (#8402)
* CKS: retry if unable to drain node or unable to upgrade k8s node

I tried CKS upgrade 16 times, 11 of 16 upgrades succeeded.

2 of 16 upgrades failed due to
```
error: unable to drain node "testcluster-of7974-node-18c8c33c2c3" due to error:[error when evicting pods/"cloud-controller-manager-5b8fc87665-5nwlh" -n "kube-system": Post "https://10.0.66.18:6443/api/v1/namespaces/kube-system/pods/cloud-controller-manager-5b8fc87665-5nwlh/eviction": unexpected EOF, error when evicting pods/"coredns-5d78c9869d-h5nkz" -n "kube-system": Post "https://10.0.66.18:6443/api/v1/namespaces/kube-system/pods/coredns-5d78c9869d-h5nkz/eviction": unexpected EOF], continuing command...
```

3 of 16 upgrades failed due to
```
Error from server: error when retrieving current configuration of:
Resource: "rbac.authorization.k8s.io/v1, Resource=roles", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=Role"
Name: "kubernetes-dashboard", Namespace: "kubernetes-dashboard"
from server for: "/mnt/k8sdisk//dashboard.yaml": etcdserver: leader changed
```

* CKS: remove tests of creating/deleting HA clusters as they are covered by the upgrade test

* Update PR 8402 as suggested

* test: remove CKS cluster if fail to create or verify
2024-02-06 11:14:10 +01:00
..