[TOC]
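### **Pod phases**
In this Measurement, a Pod's startup is described by four timestamps:
* create: the time the Pod was created
* schedule: the time the Pod was successfully scheduled
* run: the time the Pod started running (every container must be ready)
* watch: the time the Pod was observed through a watch
In the SLO definition, Pod startup time runs from creation until the Pod is running and that state has been observed through a watch, that is, `watchTime - createTime`. The clusterloader2 source computes it the same way; see the `pod_startup` entry below: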
```
var podStartupTransitions = map[string]measurementutil.Transition{
	"create_to_schedule": {
		From: createPhase,
		To:   schedulePhase,
	},
	"schedule_to_run": {
		From: schedulePhase,
		To:   runPhase,
	},
	"run_to_watch": {
		From: runPhase,
		To:   watchPhase,
	},
	"schedule_to_watch": {
		From: schedulePhase,
		To:   watchPhase,
	},
	"pod_startup": {
		From: createPhase,
		To:   watchPhase,
	},
}
```
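The first three phases are easy to understand, but what does the fourth one, watch, mean? Let's first use curl to watch a Pod all the way from creation to running.
On the master host, run the following command to expose kube-apiserver locally through `kubectl proxy`: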
```
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
```
Then, in another shell on the master host, run curl to watch events for the Pod named test in the default namespace:
```
$ curl http://127.0.0.1:8001/api/v1/watch/namespaces/default/pods/test
```
Next, create a Pod named test:
```
$ kubectl run test --image harbor.ccse.io:8021/kubernetes/pause:3.6
```
The curl process then receives the following events:
```
{"type":"ADDED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"test","namespace":"default","uid":"f3038772-4b56-4a1f-87da-f7d0d786532a","resourceVersion":"2988095","creationTimestamp":"2022-09-01T02:14:48Z","labels":{"run":"test"},"managedFields":[{"manager":"kubectl-run","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"test\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}}]},"spec":{"volumes":[{"name":"kube-api-access-bsbjw","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"test","image":"harbor.ccse.io:8021/kubernetes/pause:3.6","resources":{},"volumeMounts":[{"name":"kube-api-access-bsbjw","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase
":"Pending","qosClass":"BestEffort"}}}
{"type":"MODIFIED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"test","namespace":"default","uid":"f3038772-4b56-4a1f-87da-f7d0d786532a","resourceVersion":"2988096","creationTimestamp":"2022-09-01T02:14:48Z","labels":{"run":"test"},"managedFields":[{"manager":"kubectl-run","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"test\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}}]},"spec":{"volumes":[{"name":"kube-api-access-bsbjw","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"test","image":"harbor.ccse.io:8021/kubernetes/pause:3.6","resources":{},"volumeMounts":[{"name":"kube-api-access-bsbjw","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"10.35.20.5","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLower
Priority"},"status":{"phase":"Pending","conditions":[{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"}],"qosClass":"BestEffort"}}}
{"type":"MODIFIED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"test","namespace":"default","uid":"f3038772-4b56-4a1f-87da-f7d0d786532a","resourceVersion":"2988098","creationTimestamp":"2022-09-01T02:14:48Z","labels":{"run":"test"},"managedFields":[{"manager":"Go-http-client","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:startTime":{}}},"subresource":"status"},{"manager":"kubectl-run","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"test\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}}]},"spec":{"volumes":[{"name":"kube-api-access-bsbjw","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"test","image":"harbor.ccse.io:8021/kubernetes/pause:3.6","resources":{},"volumeMounts":[{"name":"kube-api-access-bsbjw","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"
terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"10.35.20.5","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Pending","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"},{"type":"Ready","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z","reason":"ContainersNotReady","message":"containers with unready status: [test]"},{"type":"ContainersReady","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z","reason":"ContainersNotReady","message":"containers with unready status: [test]"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"}],"hostIP":"10.35.20.5","startTime":"2022-09-01T02:14:48Z","containerStatuses":[{"name":"test","state":{"waiting":{"reason":"ContainerCreating"}},"lastState":{},"ready":false,"restartCount":0,"image":"harbor.ccse.io:8021/kubernetes/pause:3.6","imageID":"","started":false}],"qosClass":"BestEffort"}}}
{"type":"MODIFIED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"test","namespace":"default","uid":"f3038772-4b56-4a1f-87da-f7d0d786532a","resourceVersion":"2988101","creationTimestamp":"2022-09-01T02:14:48Z","labels":{"run":"test"},"annotations":{"cni.projectcalico.org/containerID":"5d74239bea601a89ec85ce46c1af510bf8a51d548c5e8239fc89b59e13cab32c","cni.projectcalico.org/podIP":"10.10.146.67/32","cni.projectcalico.org/podIPs":"10.10.146.67/32"},"managedFields":[{"manager":"Go-http-client","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:startTime":{}}},"subresource":"status"},{"manager":"kubectl-run","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"test\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}},{"manager":"calico","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:cni.projectcalico.org/containerID":{},"f:cni.projectcalico.org/podIP":{},"f:cni.projectcalico.org/podIPs":{}}}},"subresource":"status"}]},"spec":{"volumes":[{"name":"kube-api-access-bsbjw","projected":{"
sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"test","image":"harbor.ccse.io:8021/kubernetes/pause:3.6","resources":{},"volumeMounts":[{"name":"kube-api-access-bsbjw","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"10.35.20.5","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Pending","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"},{"type":"Ready","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z","reason":"ContainersNotReady","message":"containers with unready status: [test]"},{"type":"ContainersReady","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z","reason":"ContainersNotReady","message":"containers with unready status: 
[test]"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"}],"hostIP":"10.35.20.5","startTime":"2022-09-01T02:14:48Z","containerStatuses":[{"name":"test","state":{"waiting":{"reason":"ContainerCreating"}},"lastState":{},"ready":false,"restartCount":0,"image":"harbor.ccse.io:8021/kubernetes/pause:3.6","imageID":"","started":false}],"qosClass":"BestEffort"}}}
{"type":"MODIFIED","object":{"kind":"Pod","apiVersion":"v1","metadata":{"name":"test","namespace":"default","uid":"f3038772-4b56-4a1f-87da-f7d0d786532a","resourceVersion":"2988108","creationTimestamp":"2022-09-01T02:14:48Z","labels":{"run":"test"},"annotations":{"cni.projectcalico.org/containerID":"5d74239bea601a89ec85ce46c1af510bf8a51d548c5e8239fc89b59e13cab32c","cni.projectcalico.org/podIP":"10.10.146.67/32","cni.projectcalico.org/podIPs":"10.10.146.67/32"},"managedFields":[{"manager":"kubectl-run","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"test\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}},{"manager":"Go-http-client","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:phase":{},"f:podIP":{},"f:podIPs":{".":{},"k:{\"ip\":\"10.10.146.67\"}":{".":{},"f:ip":{}}},"f:startTime":{}}},"subresource":"status"},{"manager":"calico","operation":"Update","apiVersion":"v1","time":"2022-09-01T02:14:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:cni.projectcalico.org/containerID":{},"f:cni.projectcalico.org/podIP":{},"f:cni.projectcalico.org/podIPs":{}}}},"subresource":"status"}]},"spec":{"volumes":[{"name":"
kube-api-access-bsbjw","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"test","image":"harbor.ccse.io:8021/kubernetes/pause:3.6","resources":{},"volumeMounts":[{"name":"kube-api-access-bsbjw","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"10.35.20.5","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Running","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"},{"type":"Ready","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:50Z"},{"type":"ContainersReady","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:50Z"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-09-01T02:14:48Z"}],"hostIP":"10.35.20.5","podIP":"10.10.146.67","podIPs":[{"ip":"10.10.146.67"}],"startTime":"2022-09-01T02:14:48Z","containerStatuses":[{"name":"test","state":{"running":{"startedAt":"2022-09-01T02:14:50Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"harbor.ccse.io:8021/kubernetes/pause:3.6","imageID":"docker-pullable://harbor
.ccse.io:8021/kubernetes/pause@sha256:74bf6fc6be13c4ec53a86a5acf9fdbc6787b176db0693659ad6ac89f115e182c","containerID":"docker://43c5fe83200b0ee5852d81c5cd9d61defdf1a7083a6f413fa5d83cc597c08542","started":true}],"qosClass":"BestEffort"}}}
```
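**Note: these events are not the same thing as the Kubernetes Event resource (`kubectl get event`).**
Each event has two fields, `type` and `object`. `type` is one of ADDED (the object was created), MODIFIED (the object was changed), or DELETED (the object was deleted). `object` is the full JSON of the watched Pod test. The output of `kubectl get pod test -o json` does not include `metadata.managedFields` by default (recent kubectl versions show it with `--show-managed-fields`). `managedFields` records which manager set which fields, so comparing it across events shows who changed what.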
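The first event is the Pod's creation. `managedFields` has a single entry whose manager is `kubectl-run`.
The second event is a modification. `managedFields` still has the same single entry, unchanged. This looks odd at first; a likely reason is that the scheduler assigns the node through the Pod's `binding` subresource rather than a regular update. What did change is that `spec.nodeName` now has a value and `status.conditions` contains `PodScheduled` with status True, so this event marks successful scheduling.
In the third event, `managedFields` gains an entry whose manager is `Go-http-client`, and the fields it owns are mostly under `status`. The Pod now has `status.conditions`, `status.containerStatuses`, and related fields. Their values show that the Pod is not Running yet (the container is still in `ContainerCreating`). In other words, once kubelet sees a Pod scheduled to its node, the first thing it does is update the Pod's status.
In the fourth event, `managedFields` gains another entry whose manager is `calico`. This event shows Calico assigning an IP to the Pod and recording it in annotations.
In the fifth event, `managedFields` still has three entries, but the `Go-http-client` entry has changed (its `time` is different), and the Pod's phase is now Running.
From this analysis, the Pod creation flow can be summarized as: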
```
create -> schedule -> kubelet initialization -> IP assignment -> running
```
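None of this tells us what the watch phase in clusterloader2 means, so let's go back to the clusterloader2 source.
clusterloader2 stores the timestamp of each phase of each Pod in a two-level map. For example, `map["default/test"]["create"]` holds the creation time of the Pod test in the default namespace.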
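Next, let's see how clusterloader2 obtains each phase's timestamp for each Pod. The function below is the key one. During a test, clusterloader2 watches Pods with a given label and calls this function for every event it receives for them. The comments are mine: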
```
func (p *podStartupLatencyMeasurement) processEvent(event *eventData) {
	// obj is the "object" of the watch event above; recvTime is when clusterloader2 received the event.
	obj, recvTime := event.obj, event.recvTime
	if obj == nil {
		return
	}
	pod, ok := obj.(*corev1.Pod)
	if !ok {
		return
	}
	// Build a key of the form namespace/pod from the namespace and Pod name; it identifies one Pod.
	key := createMetaNamespaceKey(pod.Namespace, pod.Name)
	p.podMetadata.SetStateless(key, isPodStateless(pod))
	// Only events in which the Pod's phase is Running are processed, which cuts down the work.
	// Creation, scheduling, and initialization events can be skipped because the Running event
	// already carries the Pod's creation time and container start times.
	if pod.Status.Phase == corev1.PodRunning {
		// If there is no record for this Pod yet, record it; if the Pod was already recorded, ignore the event.
		if _, found := p.podStartupEntries.Get(key, createPhase); !found {
			// The watch time is when clusterloader2 received this event.
			p.podStartupEntries.Set(key, watchPhase, recvTime)
			// The creation time comes from the Pod's metadata.creationTimestamp.
			p.podStartupEntries.Set(key, createPhase, pod.CreationTimestamp.Time)
			var startTime metav1.Time
			// Walk .status.containerStatuses and use the latest container start time as the Pod's run time.
			for _, cs := range pod.Status.ContainerStatuses {
				if cs.State.Running != nil {
					if startTime.Before(&cs.State.Running.StartedAt) {
						startTime = cs.State.Running.StartedAt
					}
				}
			}
			if startTime != metav1.NewTime(time.Time{}) {
				p.podStartupEntries.Set(key, runPhase, startTime.Time)
			} else {
				klog.Errorf("%s: pod %v (%v) is reported to be running, but none of its containers is", p, pod.Name, pod.Namespace)
			}
		}
	}
}
```
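From this function we can see how clusterloader2 gets the timestamps of a Pod's phases, and we finally have the answer: **the watch timestamp of a Pod is the time at which clusterloader2 first receives an event in which the Pod is Running**. Why the first one? If a container in the Pod restarts, kubelet updates the Pod's status, kube-apiserver sends clusterloader2 another event, and the Pod is still Running in that event.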
The function above records a Pod's create, run, and watch timestamps, but not its schedule timestamp. That one is collected in the following function:
```
func (p *podStartupLatencyMeasurement) gatherScheduleTimes(c clientset.Interface) error {
	// These two fields filter for scheduling events. Note that "event" here means the Event
	// resource shown by kubectl get event, not the watch events clusterloader2 receives.
	selector := fields.Set{
		"involvedObject.kind": "Pod",
		"source":              corev1.DefaultSchedulerName,
	}.AsSelector().String()
	options := metav1.ListOptions{FieldSelector: selector}
	schedEvents, err := c.CoreV1().Events(p.selector.Namespace).List(context.TODO(), options)
	if err != nil {
		return err
	}
	// Read each Pod's scheduling time from its event object.
	for _, event := range schedEvents.Items {
		key := createMetaNamespaceKey(event.InvolvedObject.Namespace, event.InvolvedObject.Name)
		if _, exists := p.podStartupEntries.Get(key, createPhase); exists {
			if !event.EventTime.IsZero() { // If .eventTime is set, use it as the scheduling time.
				p.podStartupEntries.Set(key, schedulePhase, event.EventTime.Time)
			} else { // Otherwise fall back to .firstTimestamp.
				p.podStartupEntries.Set(key, schedulePhase, event.FirstTimestamp.Time)
			}
		}
	}
	return nil
}
```
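The function first uses a fieldSelector to pick out the scheduler's events for Pods, then reads the scheduling time from specific fields of each event object. The following command lists the same kind of events across all namespaces: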
```
$ kubectl get events --all-namespaces -o wide --field-selector involvedObject.kind=Pod,source=default-scheduler
NAMESPACE   LAST SEEN   TYPE     REASON      OBJECT     SUBOBJECT   SOURCE              MESSAGE                                            FIRST SEEN   COUNT   NAME
default     17m         Normal   Scheduled   pod/test               default-scheduler   Successfully assigned default/test to 10.35.20.5   17m          1       test.1710a798130957bd
```
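Here is the full YAML of the scheduling event for default/test (its timestamp may be far from the create and run times shown earlier, because default/test had been deleted and recreated several times by the time this part was written):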
```
$ kubectl get event test.1710a798130957bd -o yaml
apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2022-09-01T06:08:54Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: test
  namespace: default
  resourceVersion: "3016347"
  uid: 258f859a-497d-4cb3-aa0b-1b4029fd51a8
kind: Event
lastTimestamp: "2022-09-01T06:08:54Z"
message: Successfully assigned default/test to 10.35.20.5
metadata:
  creationTimestamp: "2022-09-01T06:08:54Z"
  name: test.1710a798130957bd
  namespace: default
  resourceVersion: "3016349"
  uid: fd2cc914-6a71-4579-9552-2fd7ac4c00c7
reason: Scheduled
reportingComponent: ""
reportingInstance: ""
source:
  component: default-scheduler
type: Normal
```
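The `involvedObject` field shows that this event was generated by default-scheduler after default/test was scheduled successfully. This raises a couple of questions.
First, why does the function above prefer `eventTime` and fall back to `firstTimestamp` only when it is empty?
I have not dug into this; reading the kube-scheduler source or running an experiment would settle it. One plausible explanation is that `eventTime` is the microsecond-precision field populated by the newer `events.k8s.io` API, while `firstTimestamp` belongs to the legacy core/v1 event recorder, so only one of them is set depending on how the event was emitted. In the sample above, `eventTime` is null and `firstTimestamp` is set.
Second, schedule times are collected in the gather step, meaning clusterloader2 waits until all N*30 Pods have started and then collects their schedule times in one pass. But events are kept in etcd for only one hour by default (kube-apiserver's `--event-ttl`). What happens if some Pods' events are deleted before all Pods have started successfully?
clusterloader2 does not appear to handle this case: if the event has been deleted, the Pod's schedule time is simply not recorded.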
### **How Clusterloader2 computes the statistics**