Preface

This article touches on quite a bit of Kubernetes groundwork, including but not limited to the following resource objects:
Namespace
Pod
Deployment
Service
Volume
StatefulSet
DaemonSet
Ingress
Role
If any of these are still unfamiliar to you, head over to the official Kubernetes documentation (Chinese edition) first.
Infrastructure

Install tools

```bash
yum install bind-utils -y
```
Passwordless SSH inside the private network

Add the master node's public key to each node's authorized_keys, as sketched below.
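A minimal sketch of that step, assuming root is the login user and using the node IPs that appear later in this article (172.16.168.2 as master, 172.16.168.3 as node):

```bash
# On the master node: generate a key pair if one does not exist yet
ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ''

# Push the master's public key into each node's authorized_keys
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.168.3
```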
Initialize the cluster

To save time, the sealos tool is used to bootstrap the K8s cluster. If you want to deploy a K8s cluster with kubeadm instead, see my other article, "Kubeadm 安装 Kubernetes V1.22.2 踩坑手记".
Tool

```bash
wget -c https://sealyun-home.oss-cn-beijing.aliyuncs.com/sealos/latest/sealos && \
  chmod +x sealos && mv sealos /usr/bin
```
Resource package

```bash
wget -c https://sealyun.oss-cn-beijing.aliyuncs.com/05a3db657821277f5f3b92d834bbaf98-v1.22.0/kube1.22.0.tar.gz
```
init

```bash
sealos init \
  --user root --pk /root/.ssh/id_rsa \
  --master 172.16.168.2 \
  --node 172.16.168.3 \
  --pkg-url /root/kube1.22.0.tar.gz \
  --version v1.22.0
```
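Once sealos finishes, a quick sanity check from the master node (plain kubectl, nothing sealos-specific):

```bash
# all nodes should report Ready, and system pods should be Running
kubectl get nodes -o wide
kubectl get pods -A
```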
Command-line tools

kubectl completion

```bash
yum install bash-completion -y
source /usr/share/bash-completion/bash_completion
echo 'source <(kubectl completion bash)' >> ~/.bashrc
```
helm

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```
Install traefik

```bash
git clone https://github.com/traefik/traefik-helm-chart
vim values-custom.yaml
```
Custom values

```yaml
ingressRoute:
  dashboard:
    enabled: false

ports:
  web:
    port: 8000
    nodePort: 31800
  websecure:
    port: 8443
    nodePort: 31443

service:
  enabled: true
  type: NodePort

logs:
  general:
    level: DEBUG
```
```bash
helm install -n traefik traefik ./traefik/ -f ./values-custom.yaml
```
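Note that helm does not create the target namespace on its own; if the install above fails because the traefik namespace is missing, create it first (or add --create-namespace) and retry, then verify the result:

```bash
kubectl create namespace traefik

# after the install, the pod should be Running and the
# NodePorts 31800/31443 from the custom values should be exposed
kubectl -n traefik get pods,svc
```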
Dashboard resource file

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: dashboard
  namespace: traefik
spec:
  entryPoints:
    - web
  routes:
    - match: (PathPrefix(`/dashboard`) || PathPrefix(`/api`))
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
```
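Assuming the resource above is saved as dashboard.yaml, apply it and the dashboard should answer on the web entrypoint's NodePort (31800 in the custom values); the node IP below is the master from earlier and is only an example:

```bash
kubectl apply -f dashboard.yaml

# note the trailing slash on /dashboard/
curl http://172.16.168.2:31800/dashboard/
```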
Deploy an nginx service

Generate the resource files

```bash
kubectl create deployment nginx --image=nginx --replicas=3 --port=80 -o=yaml --dry-run=client > deployment.yaml
kubectl create svc clusterip nginx --tcp=80:80 -o=yaml --dry-run=client > service.yaml
```
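Review the generated manifests and apply them (file names match the redirects above):

```bash
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# both objects carry the app=nginx label created by the generators
kubectl get deploy,svc -l app=nginx
```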
Route to nginx through traefik

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: nginx
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/nginx`)
      kind: Rule
      services:
        - name: nginx
          port: 80
      middlewares:
        - name: stripprefix
          namespace: default
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: stripprefix
  namespace: default
spec:
  stripPrefix:
    prefixes:
      - /nginx
```
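With the stripprefix middleware in place, a request to /nginx on the web NodePort should reach the nginx Service with the prefix removed; again using the example node IP from earlier:

```bash
# 31800 is the web entrypoint NodePort from the traefik custom values
curl http://172.16.168.2:31800/nginx/
```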
Install EFK

Deploy an NFS server

ES runs as a stateful workload and its storage needs to be a disk that can be shared across nodes, so NFS is used here.
```bash
yum install nfs-utils

systemctl enable rpcbind
systemctl enable nfs
systemctl start rpcbind
systemctl start nfs

mkdir /data
chmod 755 /data
vi /etc/exports

systemctl restart nfs
```
Configuration notes (a complete /etc/exports example follows the list):

/data: the directory being shared.
192.168.0.0/24: allowed client IP range; * means all clients, i.e. no restriction.
rw: read-write access.
sync: write changes to the shared directory synchronously.
no_root_squash: remote root keeps root privileges.
no_all_squash: ordinary remote users keep their own identity.
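Putting those options together, the /etc/exports entry looks roughly like this (adjust the client CIDR to your own network; the range here is the one from the notes above):

```bash
echo '/data 192.168.0.0/24(rw,sync,no_root_squash,no_all_squash)' >> /etc/exports

# re-export without restarting, or restart nfs as shown earlier
exportfs -ra
```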
Client installation

```bash
yum install nfs-utils

systemctl enable rpcbind
systemctl start rpcbind

showmount -e 172.16.168.3

mkdir /data
mount -t nfs 172.16.168.3:/data /data
mount
```
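A quick end-to-end check is to write a file from the client and confirm it shows up on the NFS server:

```bash
# on the client
touch /data/nfs-write-test

# on the NFS server (172.16.168.3)
ls -l /data/nfs-write-test
```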
Deploy the NFS provisioner

RBAC — all namespaced objects go into the logging namespace (create it first if it does not exist), the same namespace used by the Deployment and the EFK stack below.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  namespace: logging
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: logging
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: logging
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: logging
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: logging
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io
```
Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  namespace: logging
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: lank8s.cn/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              value: 172.16.168.3
            - name: NFS_PATH
              value: /data
      volumes:
        - name: nfs-client-root
          nfs:
            server: 172.16.168.3
            path: /data
```
StorageClass (cluster-scoped, so no namespace is needed)

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  archiveOnDelete: "false"
```
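Before moving on to ES, it is worth confirming that dynamic provisioning works. A throwaway PVC (the name test-claim is hypothetical) should go to Bound and produce a directory under /data on the NFS server; this assumes the logging namespace already exists:

```bash
cat <<EOF | kubectl apply -n logging -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs-client
  resources:
    requests:
      storage: 1Mi
EOF

kubectl -n logging get pvc test-claim
kubectl -n logging delete pvc test-claim
```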
Deploy ES and Kibana

ES stateful pods (StatefulSet)

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: logging
  name: es-cluster
spec:
  serviceName: elasticsearch
  replicas: 2
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: elasticsearch:7.5.0
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: rest
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
          env:
            - name: cluster.name
              value: k8s-logs
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "es-cluster-0,es-cluster-1"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
      initContainers:
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        - name: increase-vm-max-map
          image: busybox
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
        - name: increase-fd-ulimit
          image: busybox
          command: ["sh", "-c", "ulimit -n 65536"]
          securityContext:
            privileged: true
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          app: elasticsearch
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "nfs-client"
        resources:
          requests:
            storage: 10Gi
```
Headless Service

```yaml
apiVersion: v1
kind: Service
metadata:
  namespace: logging
  name: elasticsearch
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
```
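After both pods of the StatefulSet are Running, the cluster health can be checked through a temporary port-forward (any of the es-cluster pods works):

```bash
kubectl -n logging port-forward es-cluster-0 9200:9200 &

# status should be green once both nodes have joined
curl "http://localhost:9200/_cluster/health?pretty"
```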
Kibana

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: kibana:7.5.0
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          env:
            - name: ELASTICSEARCH_URL
              value: http://elasticsearch:9200
          ports:
            - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
spec:
  selector:
    app: kibana
  type: NodePort
  ports:
    - port: 8080
      targetPort: 5601
      nodePort: 30000
```
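Kibana is exposed as a NodePort Service on 30000, so it should be reachable on any node IP (the IP below is again the example master node):

```bash
kubectl -n logging get pods,svc -l app=kibana

# then open http://<node-ip>:30000 in a browser, e.g.
curl -I http://172.16.168.2:30000
```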
Deploy fluentd with helm

```bash
helm repo add fluent https://fluent.github.io/helm-charts
helm search repo fluent
helm pull fluent/fluentd
```
Then extract the downloaded archive, which gives the following directory layout:
```
fluentd
├── Chart.yaml
├── dashboards
├── README.md
├── templates
└── values.yaml
```
Make a copy of values.yaml to hold the customizations:
```bash
cp values.yaml values-custom.yaml
```
Edit values-custom.yaml; only the modified parts are shown below.
```yaml
fileConfigs:
  01_sources.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      @label @KUBERNETES
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type cri
      </parse>
      emit_unmatched_lines true
    </source>

  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host "elasticsearch.logging.svc.cluster.local"
        port 9200
        path ""
        user elastic
        password changeme
      </match>
    </label>
```
Install
```bash
helm install fluentd fluentd/ -f fluentd/values-custom.yaml -n logging

# remove it again with:
helm uninstall fluentd -n logging
```
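The chart deploys fluentd as a DaemonSet, so there should be one pod per node. A quick check (the label selector below is the chart's conventional app.kubernetes.io/name label and may differ if you override the release name):

```bash
kubectl -n logging get daemonset
kubectl -n logging get pods -l app.kubernetes.io/name=fluentd
```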
Prometheus + Grafana monitoring

Install Prometheus with Helm

Get repo info
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
Download the chart and customize the values
```bash
helm pull prometheus-community/prometheus

prometheus/
├── Chart.lock
├── charts
│   └── kube-state-metrics
├── Chart.yaml
├── README.md
├── templates
│   ├── alertmanager
│   ├── _helpers.tpl
│   ├── node-exporter
│   ├── NOTES.txt
│   ├── pushgateway
│   └── server
└── values.yaml

cp values.yaml values-custom.yaml
```
For the complete values-custom.yaml after the changes, see https://github.com/linganmin/charts-custom-value/blob/master/prometheus/custom-values.yaml; only the modified parts are listed below.
```yaml
# alertmanager
image:
  repository: quay.io/prometheus/alertmanager
  tag: v0.23.0
  pullPolicy: IfNotPresent
storageClass: "nfs-client"

# configmap-reload
name: configmap-reload
image:
  repository: registry.cn-hangzhou.aliyuncs.com/lanni-base/configmap-reload
  tag: latest
  pullPolicy: IfNotPresent

kubeStateMetrics:
  enabled: false

# node-exporter
name: node-exporter
image:
  repository: quay.io/prometheus/node-exporter
  tag: v1.3.0
  pullPolicy: IfNotPresent
tolerations:
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
    effect: "NoSchedule"

# server
image:
  repository: quay.io/prometheus/prometheus
  tag: v2.34.0
  pullPolicy: IfNotPresent
```
Install

```bash
helm install prometheus ./prometheus -f prometheus/values-custom.yaml -n monitoring
```
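As with traefik, the monitoring namespace has to exist before the install above; afterwards check that the server, alertmanager and node-exporter pods come up:

```bash
kubectl create namespace monitoring

kubectl -n monitoring get pods,svc
```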
Install Grafana with Helm

TODO
Sync system time

```bash
ntpdate time.windows.cn
hwclock --systohc
```
Proxy

Replacing k8s.gcr.io: use lank8s.cn in place of k8s.gcr.io.
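The mapping is path-for-path, which is exactly how the provisioner image was referenced earlier; for example:

```bash
# k8s.gcr.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2 becomes:
docker pull lank8s.cn/sig-storage/nfs-subdir-external-provisioner:v4.0.2
```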
Ref

https://blog.51cto.com/u_11734401/4286237#efk%E6%97%A5%E5%BF%97%E7%B3%BB%E7%BB%9F
https://devopscube.com/setup-efk-stack-on-kubernetes/
https://qizhanming.com/blog/2018/08/08/how-to-install-nfs-on-centos-7
https://www.cnblogs.com/panwenbin-logs/p/12196286.html
Kubernetes(k8s)helm 搭建 prometheus + Grafana 监控