PV gets stuck intermittently terminating #1375

Open
EoinMan opened this issue May 3, 2024 · 9 comments


EoinMan commented May 3, 2024

What happened:

Upgraded AKS Clusters to 1.28.5

Now getting

Warning VolumeFailedDelete 2m8s persistentvolume-controller error getting deleter volume plugin for volume "PODNAME": no deletable volume plugin matched

Re-installing the StorageClass (SC) and the drivers works for a couple of runs.

Then the message re-appears.

What you expected to happen:

On Helm uninstall, PV should not get stuck with the error

How to reproduce it:

Anything else we need to know?:

Helm version is v3.14.4
az version 2.60.0

Environment:

  image: mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.24.0 (listed 4 times)
  imageID: mcr.microsoft.com/oss/kubernetes-csi/blob-csi@sha256:eb1cbe0e41106c941db15eb6382ac84aa7b389c4e9b4e1d0e307a2d0043c06a7 (listed 2 times)
  image: mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.23.4 (listed 30 times)
  • Kubectl version 1.26.1
  • OS Ubuntu Linux
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
Author

EoinMan commented May 3, 2024

never mind

I did not see

Release 2024-02-07

Bug Fixes

Enable HonorPVReclaimPolicy for CSI drivers on AKS 1.27+ to align with upstream behavior.

EoinMan closed this as completed May 3, 2024
EoinMan reopened this May 3, 2024
Author

EoinMan commented May 3, 2024

Sorry still getting the issues

17m Normal RegisteredNode Node/aks-worker-23918152-vmss000011 Node aks-worker-23918152-vmss000011 event: Registered Node aks-worker-23918152-vmss000011 in Controller
17m Normal RegisteredNode Node/aks-worker-23918152-vmss00000b Node aks-worker-23918152-vmss00000b event: Registered Node aks-worker-23918152-vmss00000b in Controller
17m Normal RegisteredNode Node/aks-tef-35839376-vmss00004j Node aks-tef-35839376-vmss00004j event: Registered Node aks-tef-35839376-vmss00004j in Controller
17m Normal RegisteredNode Node/aks-ingress-32624845-vmss00001u Node aks-ingress-32624845-vmss00001u event: Registered Node aks-ingress-32624845-vmss00001u in Controller
2m50s Warning VolumeFailedDelete PersistentVolume/-persistence-pv error getting deleter volume plugin for volume "-persistence-pv": no deletable volume plugin matched

Author

EoinMan commented Jun 4, 2024

Just wondering if this might be looked at at some point.

Thank you

@andyzhangx
Member

Are you using the open source or the managed blob CSI driver? @EoinMan

Author

EoinMan commented Jun 10, 2024

AKS1 K8s == 1.28

csiDriver: v1.29.4

7 mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.29.4
7 mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.29.4
21 mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.23.4
21 mcr.microsoft.com/oss/kubernetes-csi/csi-node-driver-registrar:v2.10.0
21 mcr.microsoft.com/oss/kubernetes-csi/livenessprobe:v2.12.0

AKS2 K8s == 1.28

csiDriver: v1.29.4

3 mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.29.4
3 mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.29.4
9 mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.23.4
9 mcr.microsoft.com/oss/kubernetes-csi/csi-node-driver-registrar:v2.10.0
9 mcr.microsoft.com/oss/kubernetes-csi/livenessprobe:v2.12.0

I think it's the managed one.

@andyzhangx
Member

What do you mean by "Re-install Drivers"? You cannot uninstall the driver first and then delete a pod with the blob volume.

Author

EoinMan commented Jun 10, 2024

So Drivers are installed on the cluster

When doing a helm uninstall/delete I get the above error

The pod is stuck in a terminating state

I can get out of the terminating state by deleting the finalizers block in the PV

If I re-installed the driver on the cluster, I could helm install and helm delete 3/4 times before the issue arose again

I could understand the PV hanging if a pod still had a claim.
But with helm uninstall, all pods and claims are gone, so this points not to an error in the pods but to the driver itself.
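
The finalizer workaround described above can also be expressed as a JSON merge patch instead of a manual edit. This is only a sketch of the workaround, not a fix for the underlying error: the function names and the PV name `example-pv` are hypothetical, the script only builds the command without running it, and clearing finalizers skips whatever cleanup they were guarding, so the backing storage may have to be removed by hand.

```python
import json


def build_clear_finalizers_patch():
    """Return a JSON merge patch that removes all finalizers from an object."""
    # In a JSON merge patch (RFC 7386), a null value deletes the field.
    return json.dumps({"metadata": {"finalizers": None}})


def build_kubectl_patch_command(pv_name):
    """Build the kubectl argument list for the patch; nothing is executed here."""
    return [
        "kubectl", "patch", "pv", pv_name,
        "--type=merge",
        "-p", build_clear_finalizers_patch(),
    ]


if __name__ == "__main__":
    # "example-pv" is a placeholder, not a PV name taken from this thread.
    print(" ".join(build_kubectl_patch_command("example-pv")))
```

The argument list could be passed to `subprocess.run` once the PV name has been double-checked.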

@andyzhangx
Member

You cannot delete a PV if a pod or PVC still exists.

Author

EoinMan commented Jun 12, 2024

I understand that.

The helm uninstall deletes everything

The PV is left at the very end stuck in the terminating state

This causes issues deploying into that NS again
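
To confirm which PVs are left behind after the uninstall, one option is to filter the output of `kubectl get pv -o json` for objects that have a `deletionTimestamp` set but still carry finalizers. The sketch below assumes that JSON has already been loaded into a dict; the function name and the sample PV names are hypothetical.

```python
def find_stuck_pvs(pv_list):
    """Summarize PVs that are marked for deletion but still have finalizers."""
    stuck = []
    for item in pv_list.get("items", []):
        meta = item.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # A PV shows as Terminating when deletionTimestamp is set
        # and at least one finalizer still blocks removal.
        if meta.get("deletionTimestamp") and finalizers:
            spec = item.get("spec", {})
            stuck.append({
                "name": meta.get("name"),
                "finalizers": list(finalizers),
                "reclaimPolicy": spec.get("persistentVolumeReclaimPolicy"),
                "phase": item.get("status", {}).get("phase"),
            })
    return stuck


if __name__ == "__main__":
    # Made-up sample shaped like `kubectl get pv -o json` output.
    sample = {
        "items": [
            {
                "metadata": {
                    "name": "example-stuck-pv",
                    "deletionTimestamp": "2024-06-12T10:00:00Z",
                    "finalizers": ["kubernetes.io/pv-protection"],
                },
                "spec": {"persistentVolumeReclaimPolicy": "Delete"},
                "status": {"phase": "Failed"},
            },
            {
                "metadata": {"name": "example-healthy-pv"},
                "spec": {"persistentVolumeReclaimPolicy": "Retain"},
                "status": {"phase": "Bound"},
            },
        ]
    }
    for pv in find_stuck_pvs(sample):
        print(pv)
```

Listing the finalizers and reclaim policy next to each stuck PV makes it easier to report exactly what is blocking deletion.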
