1. Overview
Prometheus Operator is installed with helm. Once installed, it creates, configures, and manages a Prometheus cluster inside the Kubernetes cluster. The chart includes several components:
- prometheus-operator
- prometheus
- alertmanager
- node-exporter
- kube-state-metrics
- grafana
It also includes monitoring services that collect metrics from the internal Kubernetes components:
- kube-apiserver
- kube-scheduler
- kube-controller-manager
- etcd
- kube-dns/coredns
- kube-proxy
2. Architecture
The architecture diagram is as follows:
3. Installation
3.1 Deployment
[root@xiangys0134-k8s-master prometheus]# helm search prometheus-operator
[root@xiangys0134-k8s-master prometheus]# helm install --name prometheus-operator --set rbacEnable=true --namespace=monitoring stable/prometheus-operator
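The two commands above use Helm 2 syntax. Helm 3 removed the `--name` flag and replaced `helm search` with `helm search repo`, and the `stable/prometheus-operator` chart has since been deprecated in favor of `prometheus-community/kube-prometheus-stack`. A rough Helm 3 equivalent might look like the sketch below. The helper name `install_monitoring_stack` is made up for illustration, and the `rbacEnable` value is left out because it has not been checked against the newer chart.

```shell
# Sketch for Helm 3 (assumption: the commands above target Helm 2).
# install_monitoring_stack is a hypothetical helper name.
install_monitoring_stack() {
  # The community repo hosts the successor of stable/prometheus-operator.
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts || return 1
  helm repo update || return 1
  # Helm 3 takes the release name as a positional argument instead of --name.
  helm install prometheus-operator prometheus-community/kube-prometheus-stack \
    --namespace monitoring --create-namespace
}

# Usage: install_monitoring_stack
```

With the newer chart, the generated service names may differ from the ones used later in the Ingress rules, so it is worth confirming them with `kubectl get svc -n monitoring` first.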
3.2 Component pod status
[root@xiangys0134-k8s-master prometheus]# kubectl get pods -n monitoring
[root@xiangys0134-k8s-master prometheus]# kubectl get svc -n monitoring
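When the pod list is long, it can help to filter it down to the pods that still need attention. The following is a minimal sketch that reads `kubectl get pods` output from stdin; the function name `not_ready_pods` is hypothetical, and it assumes the default column layout (NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods` output on stdin and print the names of pods that
# are not Running or whose READY count (for example 1/2) is incomplete.
# The first line is skipped because it is the column header.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage: kubectl get pods -n monitoring | not_ready_pods
```

An empty result suggests every pod in the namespace is up. Pods of finished jobs (status Completed) would also be listed, which may or may not be what is wanted.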
4. Exposing the services through Ingress
4.1 Check the ingress controller
[root@xiangys0134-k8s-master prometheus]# kubectl get pod -n ingress-nginx
NAME READY STATUS RESTARTS AGE
nginx-ingress-controller-5bb8fb4bb6-mbnnz 1/1 Running 2 14d
4.2 Configure the ingress rules
[root@xiangys0134-k8s-master prometheus]# vi prometheus-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: monitoring
  name: prometheus-ingress
spec:
  rules:
  - host: grafana.domain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-grafana
          servicePort: 80
  - host: prometheus.domain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-prometheus
          servicePort: 9090
  - host: alertmanager.domain.com
    http:
      paths:
      - backend:
          serviceName: prometheus-operator-alertmanager
          servicePort: 9093
[root@xiangys0134-k8s-master prometheus]# kubectl apply -f prometheus-ingress.yaml
[root@xiangys0134-k8s-master prometheus]# kubectl get ingress -n monitoring
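The `extensions/v1beta1` Ingress API was removed in Kubernetes 1.22, so on newer clusters the same rules have to be written against `networking.k8s.io/v1`, where each path needs a `pathType` and the backend is nested under `service`. The sketch below generates such a manifest for the three hosts used here. The function names `ingress_rule` and `render_ingress` are made up for illustration, and depending on the ingress controller an `ingressClassName` may also be required (not shown).

```shell
# Print one Ingress rule in networking.k8s.io/v1 form.
# Arguments: host, service name, service port.
ingress_rule() {
  cat <<EOF
  - host: $1
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: $2
            port:
              number: $3
EOF
}

# Print the whole manifest, reusing the hosts and services from this section.
render_ingress() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: monitoring
  name: prometheus-ingress
spec:
  rules:
EOF
  ingress_rule grafana.domain.com prometheus-operator-grafana 80
  ingress_rule prometheus.domain.com prometheus-operator-prometheus 9090
  ingress_rule alertmanager.domain.com prometheus-operator-alertmanager 9093
}

# Usage: render_ingress > prometheus-ingress.yaml
```

Generating the rules from one helper keeps the three host entries consistent; the resulting file can be applied with `kubectl apply -f` in the usual way.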
4.3 Modify hosts
192.168.10.116 grafana.domain.com
192.168.10.116 prometheus.domain.com
192.168.10.116 alertmanager.domain.com
Check that the following sites are reachable:
http://grafana.domain.com:30080 (login: admin/prom-operator)
http://prometheus.domain.com:30080
http://alertmanager.domain.com:30080
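The same check can be scripted with curl. The sketch below is one possible way to do it; the function name `check_site` is hypothetical, and the IP and port are the ones from this section. It uses curl's `--resolve` option to pin the host name to the node IP for a single request, so it also works on a machine whose hosts file has not been edited.

```shell
# Print the HTTP status code a site returns through the ingress NodePort.
# Arguments: node IP, host name, port.
# --resolve maps host:port to the given IP for this request only.
check_site() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    --resolve "$2:$3:$1" "http://$2:$3/"
}

# Usage:
# for h in grafana prometheus alertmanager; do
#   echo "$h.domain.com: $(check_site 192.168.10.116 "$h.domain.com" 30080)"
# done
```

A 2xx or 3xx status code indicates that the ingress rule routed the request to a backend; a 404 from the controller usually means the host did not match any rule.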
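5. Modify the k8s configuration
Some Prometheus Operator dashboards show no data until the configuration is changed. Prometheus accesses the etcd metrics on port 4001, but etcd listens on 2379 by default. The fix is made in /etc/kubernetes/manifests/etcd.yaml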
6. Monitoring list
Node monitoring
Pod monitoring
That is about enough. Once alerting is configured, this setup should be sufficient.