I. Background
1.1 Prometheus alert
1.2 Check the pod status
Memory usage exceeds the initially allocated request, reaching 268% of the request.
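The percentage in the alert is simply memory usage divided by the memory request. A minimal sketch of that arithmetic follows; the function name `usage_pct` and the sample numbers are hypothetical and are not taken from the actual alert.

```shell
# usage_pct USED_MI REQUEST_MI
# Prints memory usage as a rounded percentage of the request.
# Both arguments are plain numbers in the same unit (for example Mi).
usage_pct() {
  awk -v used="$1" -v req="$2" 'BEGIN { printf "%.0f\n", used * 100 / req }'
}
```

For example, `usage_pct 2680 1000` prints 268, meaning the pod uses 2.68 times what it asked the scheduler to reserve.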
1.3 Set the corresponding memory request
resources:
  limits:
    cpu: "2000m"
    memory: "8096Mi"
  requests:
    cpu: "600m"
    memory: "4096Mi"
II. Helm upgrade
[k8s-prod-de@ip-172-20-21-242 elasticsearch]$ helm upgrade -i fastbull-es -n project-fastbull .
III. After the upgrade, the stateful pods did not pick up the change
The StatefulSet controller's YAML was updated, but the pods were not rolled.
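One possible cause, which this post does not confirm, is that the StatefulSet uses the OnDelete update strategy, under which the controller leaves existing pods alone until they are deleted by hand. A minimal sketch for checking this follows. The StatefulSet name fastbull-es-elasticsearch-data is inferred from the pod name fastbull-es-elasticsearch-data-1, and the helper function names are made up for illustration.

```shell
# Print the update strategy of a StatefulSet (RollingUpdate or OnDelete).
# Usage: sts_update_strategy NAME NAMESPACE
sts_update_strategy() {
  kubectl get statefulset "$1" -n "$2" \
    -o jsonpath='{.spec.updateStrategy.type}'
}

# Print "currentRevision updateRevision". If the two differ, some pods
# are still running the old pod template.
sts_revisions() {
  kubectl get statefulset "$1" -n "$2" \
    -o jsonpath='{.status.currentRevision} {.status.updateRevision}'
}

# Turn the strategy name into a one-line explanation.
explain_strategy() {
  case "$1" in
    OnDelete)      echo "pods are replaced only after they are deleted manually" ;;
    RollingUpdate) echo "the controller replaces pods automatically" ;;
    *)             echo "unknown strategy: $1" ;;
  esac
}
```

It could be used as `explain_strategy "$(sts_update_strategy fastbull-es-elasticsearch-data project-fastbull)"`.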
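IV. Scaling via Rancher to redeploy
Since the pods would not roll, I tried scaling the workload down and back up to force the pods to be recreated.
V. Pod stuck in Terminating
The thing I least wanted to see happened anyway.
The logs show: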
[2021-04-22T11:30:05,694][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] added {{fastbull-es-elasticsearch-client-5f85bf996-zc9gz}{H1F31fi-QTaQgjMSkcH64g}{-JChY3w9QGeY82nocNMnYQ}{172.23.3.197}{172.23.3.197:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [108]])
[2021-04-22T11:30:10,010][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] removed {{fastbull-es-elasticsearch-client-7bb9f85876-6xzqp}{EB1SJ3x1SX6RrgLHYSYZGQ}{y6lgZ4PsThWT3KOs9W0LEg}{172.23.5.251}{172.23.5.251:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [109]])
[2021-04-22T11:30:59,407][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] added {{fastbull-es-elasticsearch-client-5f85bf996-9bzxh}{4pfMtRecRgmAjIxNPrPfZA}{Uj1SFLMjRc2tFpbGrbm_iA}{172.23.4.164}{172.23.4.164:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [110]])
[2021-04-22T11:31:05,262][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] removed {{fastbull-es-elasticsearch-client-7bb9f85876-n2srr}{2Yp0QBCWSt-86zO8moRG3w}{m9hTfCaQSx-lFwYadV_84Q}{172.23.4.115}{172.23.4.115:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [111]])
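# My understanding: before being deleted, the pod may try to remove the associated fastbull-es-elasticsearch-client-xx pods, but those pods had already been replaced by my helm rolling update, so the fastbull-es-elasticsearch-client-xx pods it refers to no longer exist.
VI. Force-delete the pod to release its name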
[k8s-prod-de@ip-172-20-21-242 ~]$ kubectl delete pods fastbull-es-elasticsearch-data-1 -n project-fastbull --grace-period=0 --force
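Force deletion removes the pod object from the API server without waiting for the kubelet to confirm that the containers have stopped, so for a StatefulSet the old process may briefly coexist with its replacement. Before forcing, it can be worth checking why the pod is stuck. The sketch below is one way to do that; the function names are hypothetical.

```shell
# Print the pod's deletion timestamp and any finalizers still attached.
# Usage: pod_deletion_info POD NAMESPACE
pod_deletion_info() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.metadata.deletionTimestamp} {.metadata.finalizers}'
}

# Succeed (exit 0) if the pod has a deletion timestamp, i.e. it has been
# asked to terminate but the object still exists.
is_terminating() {
  [ -n "$(kubectl get pod "$1" -n "$2" \
      -o jsonpath='{.metadata.deletionTimestamp}')" ]
}
```

For example, `is_terminating fastbull-es-elasticsearch-data-1 project-fastbull && pod_deletion_info fastbull-es-elasticsearch-data-1 project-fastbull` prints the details only while the pod is still terminating.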
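VII. More pitfalls afterwards
Once the pods were upgraded successfully, the application could no longer connect. Checking the cluster showed nothing wrong with the nodes (I compared all 7 pod nodes in the cluster).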
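The post does not say how the cluster was inspected. One common way is to query Elasticsearch's `_cluster/health` and `_cat/nodes` endpoints, sketched below. The base URL passed in is an assumption (for instance a local port-forward to the client service), and the function names are hypothetical.

```shell
# Fetch the cluster health document as JSON.
# Usage: es_cluster_health BASE_URL   (e.g. http://localhost:9200, assumed)
es_cluster_health() {
  curl -s "$1/_cluster/health"
}

# List the names of the nodes currently in the cluster, one per line.
es_node_names() {
  curl -s "$1/_cat/nodes?h=name"
}

# Read a health JSON document on stdin and print its status field
# (green, yellow or red).
es_health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}
```

For example, `es_cluster_health http://localhost:9200 | es_health_status` prints the status, and `es_node_names http://localhost:9200 | wc -l` counts the nodes to compare against the expected 7.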
VIII. Fix
I redeployed es-master as well, after which the application pods started successfully.