1. Background

1.1 Prometheus alert

1.2 Check the pod status

Memory usage exceeded the originally configured request, reaching 268% of the request.
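
To make the 268% figure concrete, here is a small sketch of the arithmetic. The post does not record the old request or the measured usage, so 2750 and 1024 (both in Mi) are hypothetical values chosen only because they land on 268%. `usage_pct` is a made-up helper name.

```shell
# usage_pct USED_MI REQUEST_MI
# Prints memory usage as an integer percentage of the request (rounded down).
usage_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Hypothetical numbers: 2750Mi in use against a 1024Mi request.
usage_pct 2750 1024   # prints 268
```

The usage side of this ratio can typically be read from `kubectl top pod -n project-fastbull`, assuming a metrics server is installed in the cluster.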

1.3 Configure a matching memory request
  resources:
    limits:
      cpu: "2000m"
      memory: "8096Mi"
    requests:
      cpu: "600m"
      memory: "4096Mi"
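
After changing the values, it can help to confirm what the StatefulSet's pod template actually contains. The following is a minimal sketch: `show_resources` is a hypothetical helper, and the StatefulSet name `fastbull-es-elasticsearch-data` is an assumption derived from the pod name `fastbull-es-elasticsearch-data-1` that appears later in this post. It also assumes the Elasticsearch container is the first container in the template.

```shell
# show_resources STATEFULSET NAMESPACE
# Prints the resources stanza of the first container in the pod template.
show_resources() {
  kubectl get statefulset "$1" -n "$2" \
    -o jsonpath='{.spec.template.spec.containers[0].resources}'
}

# Example call (StatefulSet name is an assumption):
# show_resources fastbull-es-elasticsearch-data project-fastbull
```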

2. Helm upgrade

[k8s-prod-de@ip-172-20-21-242 elasticsearch]$ helm upgrade -i fastbull-es -n project-fastbull .

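3. After the upgrade, the change does not reach the stateful pods

The StatefulSet's YAML already contains the new values, but the controller does not roll the pods.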
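
The post does not say why the rollout stalled. Two things commonly worth checking are the update strategy (with `OnDelete`, the controller only replaces pods that are deleted manually) and whether the current and update revisions differ (with `RollingUpdate`, the controller waits for each pod to become Ready before moving on). The sketch below reads those three fields. `sts_rollout_info` is a hypothetical helper, and the StatefulSet name in the example call is assumed from the pod names.

```shell
# sts_rollout_info STATEFULSET NAMESPACE
# Prints: <updateStrategy type> <currentRevision> <updateRevision>
sts_rollout_info() {
  kubectl get statefulset "$1" -n "$2" \
    -o jsonpath='{.spec.updateStrategy.type} {.status.currentRevision} {.status.updateRevision}'
}

# Example call (StatefulSet name is an assumption):
# sts_rollout_info fastbull-es-elasticsearch-data project-fastbull
```

If the two revisions differ, some pods are still running the old template.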

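4. Scaling down and up through Rancher

Because the pods would not roll, I tried scaling the workload down and back up to force the pods to be recreated.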
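
I did this in the Rancher UI. A rough command-line equivalent, as a sketch only, would be `kubectl scale`. `scale_sts` is a hypothetical helper. The replica counts in the example are placeholders because the post does not record them, and the StatefulSet name is assumed from the pod names.

```shell
# scale_sts STATEFULSET NAMESPACE REPLICAS
# Sets the desired replica count of a StatefulSet.
scale_sts() {
  kubectl scale statefulset "$1" -n "$2" --replicas="$3"
}

# Example (name and counts are assumptions): scale down, then back up.
# scale_sts fastbull-es-elasticsearch-data project-fastbull 1
# scale_sts fastbull-es-elasticsearch-data project-fastbull 2
```

Note that a StatefulSet scales down from the highest ordinal first, so this only recreates the pods at the top of the ordinal range.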

5. Pod stuck in Terminating

The thing I least wanted to see happened anyway.

The log shows:

[2021-04-22T11:30:05,694][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] added {{fastbull-es-elasticsearch-client-5f85bf996-zc9gz}{H1F31fi-QTaQgjMSkcH64g}{-JChY3w9QGeY82nocNMnYQ}{172.23.3.197}{172.23.3.197:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [108]])

[2021-04-22T11:30:10,010][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] removed {{fastbull-es-elasticsearch-client-7bb9f85876-6xzqp}{EB1SJ3x1SX6RrgLHYSYZGQ}{y6lgZ4PsThWT3KOs9W0LEg}{172.23.5.251}{172.23.5.251:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [109]])

[2021-04-22T11:30:59,407][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] added {{fastbull-es-elasticsearch-client-5f85bf996-9bzxh}{4pfMtRecRgmAjIxNPrPfZA}{Uj1SFLMjRc2tFpbGrbm_iA}{172.23.4.164}{172.23.4.164:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [110]])

[2021-04-22T11:31:05,262][INFO ][o.e.c.s.ClusterApplierService] [fastbull-es-elasticsearch-data-1] removed {{fastbull-es-elasticsearch-client-7bb9f85876-n2srr}{2Yp0QBCWSt-86zO8moRG3w}{m9hTfCaQSx-lFwYadV_84Q}{172.23.4.115}{172.23.4.115:9300},}, reason: apply cluster state (from master [master {fastbull-es-elasticsearch-master-0}{6x_s5h7BRSOTmPnTrcoTbw}{KBHVrMjuRb2hyle98GFyrA}{172.23.5.102}{172.23.5.102:9300} committed version [111]])

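# My reading: before it is deleted, it may try to remove the associated fastbull-es-elasticsearch-client-xx pods, but those pods had already been replaced by my helm rolling update, so the fastbull-es-elasticsearch-client-xx pods it refers to no longer exist.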
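
The log entries above are cluster-state updates recording client nodes joining and leaving, so on their own they may not explain why the data pod is stuck. One way to look at the pod directly is to read its deletion timestamp and finalizers: a set `deletionTimestamp` means deletion was requested, and any remaining finalizers would block removal. This is a sketch; `pod_deletion_info` is a hypothetical helper.

```shell
# pod_deletion_info POD NAMESPACE
# Prints: <deletionTimestamp> <finalizers>
pod_deletion_info() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.metadata.deletionTimestamp} {.metadata.finalizers}'
}

# Example call:
# pod_deletion_info fastbull-es-elasticsearch-data-1 project-fastbull
```

`kubectl describe pod fastbull-es-elasticsearch-data-1 -n project-fastbull` shows the related events as well.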

6. Force-delete the pod to release its name

[k8s-prod-de@ip-172-20-21-242 ~]$ kubectl delete pods fastbull-es-elasticsearch-data-1 -n project-fastbull --grace-period=0 --force
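
A force delete removes the pod object from the API server without waiting for the kubelet to confirm that the container has stopped, so it is worth checking that the replacement pod actually comes up. The sketch below waits for the Ready condition. `wait_pod_ready` is a hypothetical helper and the 300s default timeout is an arbitrary choice. If the controller has not recreated the pod yet, `kubectl wait` may report that the pod was not found, and the call needs to be retried.

```shell
# wait_pod_ready POD NAMESPACE [TIMEOUT]
# Blocks until the pod reports Ready, or the timeout (default 300s) expires.
wait_pod_ready() {
  kubectl wait --for=condition=Ready "pod/$1" -n "$2" --timeout="${3:-300s}"
}

# Example call:
# wait_pod_ready fastbull-es-elasticsearch-data-1 project-fastbull
```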

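7. More pitfalls along the way

Once the pod had been upgraded successfully, the application could no longer connect. Checking the cluster showed nothing wrong with the nodes (I compared all 7 pod nodes in the cluster).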
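
The node comparison can be done against Elasticsearch's own view of the cluster through the `_cat/nodes` and `_cluster/health` endpoints. This is a sketch: `es_get` is a hypothetical helper, and the base URL in the examples is an assumption (a client service reachable on the default HTTP port 9200, for instance through a port-forward or from inside the cluster).

```shell
# es_get BASE_URL PATH
# Issues a GET request against an Elasticsearch HTTP endpoint.
es_get() {
  curl -sS "$1/$2"
}

# Examples (service name and port are assumptions):
# es_get http://fastbull-es-elasticsearch-client:9200 '_cat/nodes?v'
# es_get http://fastbull-es-elasticsearch-client:9200 '_cluster/health?pretty'
```

The first call lists the nodes that have joined the cluster and the second reports the overall status and node count.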

8. Fix

I redeployed es-master as well, after which the application pods started successfully.
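
The post does not record exactly how the master was redeployed. One possible way to do it from the command line, as a sketch, is `kubectl rollout restart`. `restart_sts` is a hypothetical helper, and the StatefulSet name `fastbull-es-elasticsearch-master` is assumed from the pod name `fastbull-es-elasticsearch-master-0` in the log above.

```shell
# restart_sts STATEFULSET NAMESPACE
# Triggers a rolling restart, then waits for the rollout to finish.
restart_sts() {
  kubectl rollout restart "statefulset/$1" -n "$2" &&
    kubectl rollout status "statefulset/$1" -n "$2"
}

# Example call (StatefulSet name is an assumption):
# restart_sts fastbull-es-elasticsearch-master project-fastbull
```

This only rolls the pods when the StatefulSet uses the `RollingUpdate` strategy. With `OnDelete`, the pods still have to be deleted by hand.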

Last modified: April 22, 2021
