[TOC]

The kubelet's disk checks use two parameters, `imagefs` and `nodefs`:

- imagefs: monitors the partition that holds the directory set by Docker's `data-root` (or the older `graph`) startup parameter. Default: `/var/lib/docker`
- nodefs: monitors the partition that holds the directory set by the kubelet's `--root-dir` startup parameter. Default: `/var/lib/kubelet`

## Environment

Kubernetes version

```shell
$ kubectl get nodes
NAME           STATUS   ROLES    AGE   VERSION
k8s-master01   Ready    master   85d   v1.18.18
k8s-master02   Ready    master   85d   v1.18.18
k8s-node01     Ready    <none>   85d   v1.18.18
k8s-node02     Ready    <none>   85d   v1.18.18
k8s-node03     Ready    <none>   85d   v1.18.18
```

Node status

```shell
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 01 Dec 2021 11:39:29 +0800   Wed, 01 Dec 2021 11:39:29 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 01 Dec 2021 13:59:51 +0800   Wed, 01 Dec 2021 11:39:25 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 01 Dec 2021 13:59:51 +0800   Wed, 01 Dec 2021 11:39:25 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 01 Dec 2021 13:59:51 +0800   Wed, 01 Dec 2021 11:39:25 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 01 Dec 2021 13:59:51 +0800   Wed, 01 Dec 2021 11:39:25 +0800   KubeletReady                 kubelet is posting ready status
```

Docker data directory

```shell
$ docker info | grep "Docker Root Dir"
 Docker Root Dir: /data/docker/data
```

kubelet data directory

```shell
$ ps -ef | grep kubelet
/data/k8s/bin/kubelet --alsologtostderr=true --logtostderr=false --v=4 --log-dir=/data/k8s/logs/kubelet --hostname-override=k8s-master01 --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --kubeconfig=/data/k8s/certs/kubelet.kubeconfig --bootstrap-kubeconfig=/data/k8s/certs/bootstrap.kubeconfig --config=/data/k8s/conf/kubelet-config.yaml --cert-dir=/data/k8s/certs/ --root-dir=/data/k8s/data/kubelet/ --pod-infra-container-image=ecloudedu/pause-amd64:3.0
```

Partition usage

```shell
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        40G  8.8G   32G  23% /
/dev/sdb         40G  1.9G   39G  10% /data/docker/data
...
```

## Test plan

1. Verify that nodefs exceeds the threshold
2. Verify that imagefs exceeds the threshold
3. Verify that imagefs and nodefs both exceed their thresholds at the same time

### Verify that nodefs exceeds the threshold

The partition that holds the kubelet's `--root-dir` (`/`) is 23% used, so 77% is available. Now set the nodefs threshold to 78%. Eviction triggers when the available space falls below the threshold, so the node should report nodefs as over the limit.

```yaml
evictionHard:
  memory.available: 10%
  nodefs.available: 78%
  nodefs.inodesFree: 10%
  imagefs.available: 10%
  imagefs.inodesFree: 10%
```

Then check the node status. The event `Attempting to reclaim ephemeral-storage` means the kubelet is trying to reclaim disk space.

```shell
$ kubectl describe node k8s-master01
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 01 Dec 2021 14:18:56 +0800   Wed, 01 Dec 2021 14:18:56 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 01 Dec 2021 15:03:52 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         True    Wed, 01 Dec 2021 15:03:52 +0800   Wed, 01 Dec 2021 14:56:13 +0800   KubeletHasDiskPressure       kubelet has disk pressure
  PIDPressure          False   Wed, 01 Dec 2021 15:03:52 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 01 Dec 2021 15:03:52 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletReady                 kubelet is posting ready status
...
Events:
  Type     Reason                   Age                    From     Message
  ----     ------                   ----                   ----     -------
  Normal   Starting                 6m45s                  kubelet  Starting kubelet.
  Normal   NodeAllocatableEnforced  6m45s                  kubelet  Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  6m45s                  kubelet  Node k8s-master01 status is now: NodeHasSufficientMemory
  Normal   NodeHasDiskPressure      6m45s                  kubelet  Node k8s-master01 status is now: NodeHasDiskPressure
  Normal   NodeHasSufficientPID     6m45s                  kubelet  Node k8s-master01 status is now: NodeHasSufficientPID
  Warning  EvictionThresholdMet     105s (x31 over 6m45s)  kubelet  Attempting to reclaim ephemeral-storage
```

### Verify that imagefs exceeds the threshold

The partition that holds the Docker data directory (`/data/docker/data`) is 10% used, so 90% is available. Now set the imagefs threshold to 91%. The node should report imagefs as over the limit.

```yaml
evictionHard:
  memory.available: 10%
  nodefs.available: 10%
  nodefs.inodesFree: 10%
  imagefs.available: 91%
  imagefs.inodesFree: 10%
```

Then check the node status. The event `Attempting to reclaim ephemeral-storage` means the kubelet is trying to reclaim disk space.

```shell
$ kubectl describe node k8s-master01
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 01 Dec 2021 14:18:56 +0800   Wed, 01 Dec 2021 14:18:56 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 01 Dec 2021 15:17:31 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         True    Wed, 01 Dec 2021 15:17:31 +0800   Wed, 01 Dec 2021 14:56:13 +0800   KubeletHasDiskPressure       kubelet has disk pressure
  PIDPressure          False   Wed, 01 Dec 2021 15:17:31 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 01 Dec 2021 15:17:31 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletReady                 kubelet is posting ready status
...
Events:
  Type     Reason                   Age   From     Message
  ----     ------                   ----  ----     -------
  Normal   NodeHasSufficientPID     18s   kubelet  Node k8s-master01 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  18s   kubelet  Updated Node Allocatable limit across pods
  Warning  EvictionThresholdMet     18s   kubelet  Attempting to reclaim ephemeral-storage
  Normal   NodeHasSufficientMemory  18s   kubelet  Node k8s-master01 status is now: NodeHasSufficientMemory
  Normal   NodeHasDiskPressure      18s   kubelet  Node k8s-master01 status is now: NodeHasDiskPressure
  Normal   Starting                 18s   kubelet  Starting kubelet.
```

### Verify that imagefs and nodefs both exceed their thresholds at the same time

Now set the imagefs threshold to 91% and the nodefs threshold to 78%. The node should report both imagefs and nodefs as over the limit.

```yaml
evictionHard:
  memory.available: 10%
  nodefs.available: 78%
  nodefs.inodesFree: 10%
  imagefs.available: 91%
  imagefs.inodesFree: 10%
```

Then check the node status. The event `Attempting to reclaim ephemeral-storage` means the kubelet is trying to reclaim disk space.

```shell
$ kubectl describe node k8s-master01
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 01 Dec 2021 14:18:56 +0800   Wed, 01 Dec 2021 14:18:56 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 01 Dec 2021 15:23:03 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         True    Wed, 01 Dec 2021 15:23:03 +0800   Wed, 01 Dec 2021 15:23:03 +0800   KubeletHasDiskPressure       kubelet has disk pressure
  PIDPressure          False   Wed, 01 Dec 2021 15:23:03 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 01 Dec 2021 15:23:03 +0800   Wed, 01 Dec 2021 14:14:34 +0800   KubeletReady                 kubelet is posting ready status
...
Events:
  Type     Reason                   Age                  From     Message
  ----     ------                   ----                 ----     -------
  Normal   Starting                 2m9s                 kubelet  Starting kubelet.
  Normal   NodeHasSufficientPID     2m9s                 kubelet  Node k8s-master01 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  2m9s                 kubelet  Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  2m9s                 kubelet  Node k8s-master01 status is now: NodeHasSufficientMemory
  Normal   NodeHasDiskPressure      2m7s (x2 over 2m9s)  kubelet  Node k8s-master01 status is now: NodeHasDiskPressure
  Warning  EvictionThresholdMet     8s (x13 over 2m9s)   kubelet  Attempting to reclaim ephemeral-storage
```

## Summary

1. nodefs is the partition that holds the `--root-dir` directory; imagefs is the partition that holds the Docker data directory (`data-root`).
2. It is recommended that nodefs and imagefs share one partition, but that partition should be sized generously.
3. When nodefs and imagefs share one partition, the other kubelet parameters `--root-dir`, `--cert-dir`