[Cloud Computing] Kubernetes from Basics to Practice: From Deployment to Operations
Introduction
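Kubernetes (K8s for short) is the benchmark technology for container orchestration and has become the de facto standard for deploying modern cloud-native applications. It grew out of Google's internal Borg system and years of production experience. Google open-sourced it in 2014 and, with the 1.0 release in 2015, donated it to the CNCF (Cloud Native Computing Foundation). This article covers the core concepts, architecture, core resource objects, scheduling mechanisms, and operational practices of Kubernetes, so that a reader starting from zero can work toward deploying and operating a production cluster independently.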
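1. Kubernetes Overview
1.1 What Is Kubernetes
Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Its core features include: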
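- Self-healing: restarts failed containers automatically, and replaces or reschedules Pods when their node becomes unavailable
- Horizontal scaling: scales a workload with a single command (for example kubectl scale on a Deployment) or automatically based on CPU utilization
- Service discovery and load balancing: gives containers a stable network identity and distributes traffic across them
- Automatic bin packing: places containers on suitable nodes according to their resource requirements
- Configuration and secret management: manages configuration and sensitive data separately so they do not leak into images
- Storage orchestration: mounts storage systems automatically, such as local storage, NFS, and cloud storage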
1.2 Kubernetes Architecture
┌───────────────────────────────────────────────────────────────┐
│                      Kubernetes Cluster                       │
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                      Control Plane                      │  │
│  │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │  │
│  │ │ API Server│ │ Scheduler │ │ Controller│ │ etcd      │ │  │
│  │ │           │ │           │ │ Manager   │ │           │ │  │
│  │ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │  │
│  └────────────────────────────┬────────────────────────────┘  │
│                               │                               │
│  ┌────────────────────────────┴────────────────────────────┐  │
│  │                       Data Plane                        │  │
│  │  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │  │
│  │  │    Node 1     │ │    Node 2     │ │    Node 3     │  │  │
│  │  │ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │  │  │
│  │  │ │  kubelet  │ │ │ │  kubelet  │ │ │ │  kubelet  │ │  │  │
│  │  │ │ kube-proxy│ │ │ │ kube-proxy│ │ │ │ kube-proxy│ │  │  │
│  │  │ └─────┬─────┘ │ │ └─────┬─────┘ │ │ └─────┬─────┘ │  │  │
│  │  │ ┌─────▼─────┐ │ │ ┌─────▼─────┐ │ │ ┌─────▼─────┐ │  │  │
│  │  │ │ Container │ │ │ │ Container │ │ │ │ Container │ │  │  │
│  │  │ │  Runtime  │ │ │ │  Runtime  │ │ │ │  Runtime  │ │  │  │
│  │  │ └───────────┘ │ │ └───────────┘ │ │ └───────────┘ │  │  │
│  │  └───────────────┘ └───────────────┘ └───────────────┘  │  │
│  └─────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────┘
1.3 Core Components in Detail
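Control Plane components:
- kube-apiserver: the single entry point to the cluster; it serves all RESTful API requests
- etcd: a highly available key-value store that holds all cluster state
- kube-scheduler: schedules Pods by assigning each one to a suitable node
- kube-controller-manager: runs the controllers that keep driving the cluster toward its desired state
Node (worker node) components:
- kubelet: the node agent; it manages the lifecycle of the containers on its node
- kube-proxy: the network proxy; it maintains the network rules on each node
- Container Runtime: the container runtime (containerd or CRI-O; since dockershim was removed in Kubernetes 1.24, Docker Engine needs the cri-dockerd adapter)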
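The "desired state" idea behind kube-controller-manager can be illustrated with a toy reconcile loop. This is a minimal sketch and not Kubernetes source code: the `reconcile` function and the Pod naming scheme are hypothetical, and a real controller watches the API server rather than receiving plain lists.

```python
import itertools

_ids = itertools.count(1)  # hypothetical source of unique Pod name suffixes


def reconcile(desired_replicas, running_pods):
    """Return the Pod list after one reconcile pass.

    Compares the desired replica count with the Pods actually running
    and creates or deletes Pods to close the gap.
    """
    pods = list(running_pods)
    while len(pods) < desired_replicas:   # too few: create replacements
        pods.append(f"pod-{next(_ids)}")
    while len(pods) > desired_replicas:   # too many: remove the newest
        pods.pop()
    return pods


pods = reconcile(3, [])      # initial rollout creates three Pods
pods.remove(pods[0])         # simulate a crashed Pod
pods = reconcile(3, pods)    # the next pass restores the count
print(len(pods))             # prints 3
```

The same compare-and-correct pattern is what makes self-healing work: nothing "repairs" a Pod, the controller simply notices the count is off and creates a new one.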
2. Core Resource Objects
2.1 Pod - The Smallest Schedulable Unit
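The Pod manifests in this section express CPU as millicores (`250m`) and memory with binary suffixes (`128Mi`). As a rough aid for reading them, the sketch below converts such quantities to plain numbers. The helper names `parse_cpu` and `parse_memory` are hypothetical, and only the most common suffixes are handled.

```python
from decimal import Decimal

# Binary and decimal suffixes commonly seen in resource quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(quantity: str) -> Decimal:
    """Return CPU cores; '250m' means 250 millicores, i.e. 0.25 core."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def parse_memory(quantity: str) -> int:
    """Return bytes for quantities such as '128Mi' or '1G'."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))


# A container's requests must not exceed its limits.
assert parse_cpu("250m") <= parse_cpu("500m")
assert parse_memory("128Mi") <= parse_memory("256Mi")
```

The scheduler places a Pod using its requests, while the limits cap what the container may consume at run time.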
# Basic Pod definition
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    environment: production
spec:
  containers:
  - name: nginx
    image: nginx:1.24
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    - containerPort: 443
      name: https
      protocol: TCP
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
    # The probe paths assume the nginx configuration serves /healthz and /ready.
    # The stock image does not, so adjust them or the container keeps restarting.
    livenessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
    env:
    - name: NGINX_HOST
      value: "localhost"
    - name: NGINX_PORT
      value: "80"

# Multi-container Pod - sidecar pattern
apiVersion: v1
kind: Pod
metadata:
  name: web-app-with-log-collector
  labels:
    app: web-app
spec:
  containers:
  # Main application container
  - name: web-app
    image: myapp:latest
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log/app
  # Sidecar log collector. /fluentd/etc and FLUENTD_CONF are conventions of
  # the Fluentd image, so the image is Fluentd rather than Fluent Bit.
  - name: log-collector
    image: fluent/fluentd:latest
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log/app
    - name: fluentd-config
      mountPath: /fluentd/etc
    env:
    - name: FLUENTD_CONF
      value: "fluent.conf"   # must match a key in the fluentd-config ConfigMap
  # Sidecar proxy container
  - name: envoy-proxy
    image: envoyproxy/envoy:v1.20
    ports:
    - containerPort: 15001
    env:
    - name: ENVOY_EDGE_STATS
      value: "true"
  volumes:
  - name: shared-logs
    emptyDir: {}
  - name: fluentd-config
    configMap:
      name: fluentd-config
2.2 ReplicaSet and Deployment
# ReplicaSet definition
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24
        ports:
        - containerPort: 80

# Deployment definition - recommended for production
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  labels:
    app: web-application
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web-application
  template:
    metadata:
      labels:
        app: web-application
        version: v1.0.0
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: web-app
        image: myorg/web-app:v1.0.0
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 8443
          name: https
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-url
        - name: REDIS_HOST
          value: "redis-service"
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: log-level
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3
        lifecycle:
          preStop:
            exec:
              # Give load balancers time to drain before the container stops
              command: ["/bin/sh", "-c", "sleep 10"]
2.3 Service and Ingress
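The Ingress in this section routes by `pathType: Prefix`. That match is done on whole path elements, and when several paths match, the longest one takes precedence. The sketch below models only that rule. It is an illustration under those assumptions, not an ingress controller; the function names are hypothetical and host matching is left out.

```python
def prefix_matches(rule_path: str, request_path: str) -> bool:
    """Element-wise prefix match on '/'-separated path segments."""
    rule = [p for p in rule_path.split("/") if p]
    req = [p for p in request_path.split("/") if p]
    return req[: len(rule)] == rule


def pick_backend(rules: dict, request_path: str):
    """Return the backend of the longest matching path, or None."""
    matches = [path for path in rules if prefix_matches(path, request_path)]
    if not matches:
        return None
    return rules[max(matches, key=len)]


# Paths configured for www.example.com in the Ingress manifest.
rules = {"/": "web-frontend:80", "/api": "api-gateway:8080"}

print(pick_backend(rules, "/api/users"))   # api-gateway:8080
print(pick_backend(rules, "/apix"))        # web-frontend:80
```

Note that `/apix` falls through to the `/` rule: `/api` matches `/api` and `/api/...`, but not a longer segment that merely starts with the same characters.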
# ClusterIP Service - in-cluster access
apiVersion: v1
kind: Service
metadata:
  name: backend-service
  labels:
    app: backend
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
  - name: http
    port: 80
    targetPort: 8080
    protocol: TCP
  - name: grpc
    port: 50051
    targetPort: 50051
    protocol: TCP

# NodePort Service - access through a port on every node
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
spec:
  type: NodePort
  selector:
    app: frontend
  ports:
  - name: http
    port: 80
    targetPort: 3000
    nodePort: 30080   # must fall inside the NodePort range (30000-32767 by default)
  - name: https
    port: 443
    targetPort: 3001
    nodePort: 30443

# LoadBalancer Service - cloud provider load balancer
apiVersion: v1
kind: Service
metadata:
  name: web-service
  annotations:
    # AWS load balancer annotations (the type below selects an NLB, not an ALB)
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:xxx"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - name: https
    port: 443
    targetPort: 8080

# Ingress - HTTP/HTTPS entry point
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
spec:
  # Replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
  - hosts:
    - www.example.com
    - api.example.com
    secretName: tls-secret
  rules:
  - host: www.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-frontend
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-gateway
            port:
              number: 8080
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
2.4 ConfigMap and Secret
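Values under a Secret's `data` field are Base64-encoded, which is an encoding and not encryption: anyone who can read the object can decode it. The short sketch below reproduces the `echo -n "password123" | base64` step from the Secret manifest in this section using Python's standard library. The two helper names are hypothetical.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Encode a string the way values under Secret.data are stored."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("password123"))       # cGFzc3dvcmQxMjM=
print(decode_secret_value("cGFzc3dvcmQxMjM="))  # password123
```

Because decoding is this easy, Secrets are usually protected with RBAC and encryption at rest rather than relied on as-is.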
# ConfigMap - application configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  # Simple key, referenced by web-deployment through configMapKeyRef
  log-level: "info"
  # Properties format
  database.properties: |
    db.host=postgres-service
    db.port=5432
    db.name=appdb
    db.pool.size=20
  # JSON format
  config.json: |
    {
      "logLevel": "info",
      "features": {
        "newUI": true,
        "betaAPI": false
      },
      "rateLimit": {
        "requests": 100,
        "window": "1m"
      }
    }

# Secret - sensitive data
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  # Base64-encoded values
  # echo -n "password123" | base64
  db-password: cGFzc3dvcmQxMjM=
  api-key: c29tZS1hcGkta2V5LWJhc2U2NC1lbmNvZGVk
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...
  tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0t...
stringData:
  # Plain text; the API server Base64-encodes it on write
  username: admin
  # Placeholder value; referenced by web-deployment through secretKeyRef
  database-url: "postgres://user:password@postgres-service:5432/appdb"
2.5 PersistentVolume and PersistentVolumeClaim
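A PersistentVolumeClaim binds to a PersistentVolume only if the volume satisfies the claim. The sketch below checks three of the conditions used by the manifests in this section: access modes, capacity, and the label selector. It is a deliberate simplification (storage class and volume mode are ignored), and the dictionary layout and function names are hypothetical.

```python
def gib(quantity: str) -> int:
    """Parse an 'NGi' quantity into GiB (only this suffix, for brevity)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity: {quantity}")
    return int(quantity[:-2])


def can_bind(pv: dict, pvc: dict) -> bool:
    """Simplified static-binding check between a PV and a PVC."""
    modes_ok = set(pvc["accessModes"]) <= set(pv["accessModes"])
    size_ok = gib(pv["capacity"]) >= gib(pvc["request"])
    labels_ok = all(pv["labels"].get(key) == value
                    for key, value in pvc.get("matchLabels", {}).items())
    return modes_ok and size_ok and labels_ok


nfs_pv = {"capacity": "100Gi", "accessModes": ["ReadWriteMany"],
          "labels": {"type": "nfs"}}
app_storage = {"request": "50Gi", "accessModes": ["ReadWriteMany"],
               "matchLabels": {"type": "nfs"}}

print(can_bind(nfs_pv, app_storage))  # True
```

Binding is one-to-one, so the 50Gi claim here takes the whole 100Gi volume; the remainder is not available to other claims.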
# PersistentVolume - NFS storage
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
  labels:
    type: nfs
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    server: nfs-server.example.com
    path: /exported/path
---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage
spec:
  # Empty class: bind to a pre-created PV instead of the default StorageClass
  storageClassName: ""
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  selector:
    matchLabels:
      type: nfs
---
# Pod using the PVC
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: app-data
      mountPath: /data
  volumes:
  - name: app-data
    persistentVolumeClaim:
      claimName: app-storage
3. Core Concepts in Detail
3.1 Namespaces
# Namespace definition
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: production
    team: platform
---
# A resource placed in the namespace
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: myorg/web:v1

# Namespace operations
kubectl get namespaces
kubectl create namespace staging
kubectl delete namespace unused-namespace

# List resources in a specific namespace
kubectl get pods -n production
kubectl get all -n production

# Set the default namespace for the current context
kubectl config set-context --current --namespace=production
3.2 Labels and Selectors
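Label selectors combine their requirements with a logical AND, and the set-based operators (`in`, `notin`, a bare key, `!key`) have precise semantics that are easy to get wrong. The sketch below models them over plain dictionaries. The `matches` function and the tuple format are hypothetical and stand in for what `kubectl get -l ...` asks the API server to evaluate.

```python
def matches(labels: dict, requirements: list) -> bool:
    """Check labels against (key, operator, values) requirements, ANDed."""
    for key, op, values in requirements:
        if op == "In":
            ok = labels.get(key) in values
        elif op == "NotIn":
            ok = labels.get(key) not in values  # also true when the key is absent
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


pod_labels = {"app": "api", "version": "v2.1.0",
              "tier": "backend", "environment": "production"}

print(matches(pod_labels, [("app", "In", ["api", "web"])]))   # True
print(matches(pod_labels, [("release", "DoesNotExist", [])])) # True
```

An equality selector such as `app=api` is the single-value case of `In`, which is why the two forms can be mixed freely in one selector string.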
# Resource label example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
  labels:
    app: api
    version: v2.1.0
    tier: backend
    environment: production
    team: backend
    managed-by: kubectl
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
        version: v2.1.0
        tier: backend
        environment: production
        # Containers have no labels field, so this label lives on the Pod template
        framework: spring-boot
    spec:
      containers:
      - name: api
        image: myorg/api:v2.1.0

# Label selectors
kubectl get pods -l "app=api"
kubectl get pods -l "app=api,version=v2.1.0"
kubectl get pods -l "app in (api,web)"
kubectl get pods -l "app notin (api,web)"
kubectl get pods -l "environment=production,tier=backend"
# Single quotes keep an interactive shell from expanding "!"
kubectl get deployments -l '!release'

# Modify labels
kubectl label pods nginx-pod environment=production
kubectl label pods nginx-pod version=v2 --overwrite
kubectl label pods -l "app=api" team=backend --overwrite
3.3 Annotations
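# Annotations store non-identifying metadata
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  annotations:
    # Build information
    kubernetes.io/change-cause: "Deployment updated to v2.1.0"
    last-modified-by: "devops-team"
    # Ownership information
    config.example.com/owner: "backend-team"
    config.example.com/support-email: "backend@example.com"
    # Monitoring information (annotation-based scrape setups usually read
    # these from the Pod template, so they may need to be repeated there)
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  # ...
4. Resource Scheduling and Scaling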
4.1 HPA - Horizontal Pod Autoscaling
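The core of the HPA is a simple ratio: desired replicas = ceil(current replicas × current metric / target metric), clamped to `minReplicas` and `maxReplicas`. The sketch below works through that arithmetic for a single metric. The function name is hypothetical, the 10% tolerance is the documented default, and the `behavior` policies and stabilization windows are not modeled.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Compute the replica count the HPA would aim for, for one metric."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:   # close enough to target: no change
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# 4 replicas averaging 90% CPU against a 70% target, bounds 2..10
print(desired_replicas(4, 90, 70, 2, 10))   # 6
# 72% is within the 10% tolerance of 70%, so nothing changes
print(desired_replicas(4, 72, 70, 2, 10))   # 4
```

When several metrics are configured, as in the manifest in this section, the HPA evaluates each one separately and uses the largest result.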
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  # Based on CPU utilization (relative to the containers' CPU requests)
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Based on memory utilization
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  # Based on a custom metric (requires a custom metrics adapter in the cluster)
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

# Check HPA status
kubectl get hpa
kubectl describe hpa web-hpa

# Manual scaling (an active HPA will override this on its next evaluation)
kubectl scale deployment web-deployment --replicas=5
4.2 VPA - Vertical Pod Autoscaling
# VerticalPodAutoscaler (a separate add-on; its CRDs are not part of core Kubernetes)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  updatePolicy:
    updateMode: "Auto"  # Auto, Off, Initial
  resourcePolicy:
    containerPolicies:
    - containerName: api
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
4.3 Resource Quotas and Limits
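A LimitRange does two things for each container: it fills in requests and limits that were left out, and it rejects values outside its min/max bounds. The sketch below models that for CPU only, in millicores, using the numbers from the LimitRange manifest in this section. It is a simplified illustration; the function name and dictionary layout are hypothetical.

```python
def apply_limit_range(cpu: dict, limit_range: dict) -> dict:
    """Default and validate one container's CPU request/limit (millicores)."""
    out = dict(cpu)
    if "limit" in out and "request" not in out:
        out["request"] = out["limit"]          # request follows an explicit limit
    out.setdefault("limit", limit_range["default"])
    out.setdefault("request", limit_range["defaultRequest"])
    if (out["request"] < limit_range["min"]
            or out["limit"] > limit_range["max"]
            or out["request"] > out["limit"]):
        raise ValueError(f"container violates LimitRange: {out}")
    return out


# CPU values of the default-limits LimitRange, in millicores
default_limits = {"min": 50, "max": 2000, "default": 500, "defaultRequest": 200}

print(apply_limit_range({}, default_limits))
# {'limit': 500, 'request': 200}
```

One consequence worth noticing: a container that sets only a request above the default limit (for example 600m here) is rejected, because the defaulted limit of 500m ends up below its request.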
# ResourceQuota - namespace-level resource quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
    services: "10"
    persistentvolumeclaims: "20"
---
# LimitRange - per-container resource limits and defaults
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - max:
      cpu: "2"
      memory: 2Gi
    min:
      cpu: 50m
      memory: 64Mi
    default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 200m
      memory: 256Mi
    type: Container
5. Operations in Practice
5.1 Rolling Updates and Rollbacks
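The effect of `maxSurge` and `maxUnavailable` is easiest to see as two bounds on the Pod count during a rollout. The sketch below computes them, including the rounding used for percentage values (`maxSurge` rounds up, `maxUnavailable` rounds down). The function names are hypothetical, and Kubernetes itself rejects the case where both values resolve to zero.

```python
import math


def resolve(value, replicas, round_up):
    """Resolve an int or a percentage string like '25%' to a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        raw = replicas * int(value[:-1]) / 100
        return math.ceil(raw) if round_up else math.floor(raw)
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (min_available, max_total) Pods during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


print(rollout_bounds(5, 1, 0))            # (5, 6)
print(rollout_bounds(10, "25%", "25%"))   # (8, 13)
```

With `maxSurge: 1` and `maxUnavailable: 0`, a 5-replica Deployment never drops below 5 available Pods and briefly runs 6, so the cluster needs spare capacity for that one extra Pod.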
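# Rolling update
kubectl set image deployment/web-deployment web=myorg/web:v2.0.0
kubectl rollout status deployment/web-deployment

# View rollout history
kubectl rollout history deployment/web-deployment
kubectl rollout history deployment/web-deployment --revision=3

# Roll back to the previous revision
kubectl rollout undo deployment/web-deployment

# Roll back to a specific revision
kubectl rollout undo deployment/web-deployment --to-revision=2

# Deployment update strategy in detail
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # max Pods above the desired replica count
      maxUnavailable: 0  # max unavailable Pods (0 keeps full capacity during the update)
  # Probe configuration also affects the rollout; the fields below belong to spec, not strategy
  minReadySeconds: 30           # how long a new Pod must stay ready before it counts as available
  progressDeadlineSeconds: 600  # time after which a stalled rollout is reported as failed
5.2 Taints and Tolerations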
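Whether a Pod may land on a tainted node comes down to matching each taint against the Pod's tolerations. The sketch below models the matching rules: `Equal` compares key and value, `Exists` ignores the value, an empty effect matches every effect, and an empty key with `Exists` matches every taint. The function names are hypothetical, and `tolerationSeconds` is not modeled.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if one toleration matches one taint."""
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False                    # an empty effect matches every effect
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key")
    if operator == "Exists":
        return key is None or key == taint["key"]  # empty key matches all keys
    return key == taint["key"] and toleration.get("value") == taint.get("value")


def schedulable(tolerations: list, taints: list) -> bool:
    """A node is usable if every NoSchedule/NoExecute taint is tolerated."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
maintenance = {"key": "maintenance", "value": "true", "effect": "NoExecute"}
tolerations = [
    {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"},
    {"key": "special", "operator": "Exists"},
]

print(schedulable(tolerations, [gpu_taint]))               # True
print(schedulable(tolerations, [gpu_taint, maintenance]))  # False
```

`PreferNoSchedule` taints are treated as soft here: they influence scoring but never block placement, which is why the sketch filters them out.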
# Taint nodes
kubectl taint nodes node1 dedicated=gpu:NoSchedule
kubectl taint nodes node1 special=true:PreferNoSchedule
kubectl taint nodes node1 maintenance=true:NoExecute --overwrite

# View taints
kubectl describe node node1 | grep -A5 Taints

# Remove a taint (note the trailing minus sign)
kubectl taint nodes node1 dedicated=gpu:NoSchedule-

# Configure tolerations on a Pod
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-training
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ml-training
  template:
    metadata:
      labels:
        app: ml-training   # required: must match spec.selector
    spec:
      tolerations:
      # Matches the dedicated=gpu taint with effect NoSchedule
      - key: "dedicated"
        operator: "Equal"
        value: "gpu"
        effect: "NoSchedule"
      # Matches any value of the dedicated key
      - key: "dedicated"
        operator: "Exists"
        effect: "NoSchedule"
      # Matches any effect of the special key
      - key: "special"
        operator: "Exists"
      nodeSelector:
        gpu: "true"
      containers:
      - name: training
        image: ml-training:latest
        resources:
          requests:
            nvidia.com/gpu: 1
          limits:
            nvidia.com/gpu: 1
5.3 Affinity and Anti-Affinity
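A required Pod anti-affinity rule with `topologyKey: kubernetes.io/hostname` means at most one matching Pod per node, so the replica count is capped by the number of eligible nodes. The toy placement below makes that consequence visible. It is a sketch under that single rule only; the function name and Pod names are hypothetical, and no other scheduling constraint is considered.

```python
def place(replicas: int, nodes: list) -> dict:
    """Place replicas one per node; Pods without a free node stay Pending."""
    free = list(nodes)
    placement = {}
    for i in range(replicas):
        # None stands for a Pod that remains Pending
        placement[f"redis-{i}"] = free.pop(0) if free else None
    return placement


result = place(6, ["node1", "node2", "node3", "node4"])
pending = [pod for pod, node in result.items() if node is None]
print(pending)   # ['redis-4', 'redis-5']
```

If leaving Pods Pending is not acceptable on a small cluster, the `preferredDuringSchedulingIgnoredDuringExecution` form spreads Pods when it can but still schedules them when it cannot.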
# Pod anti-affinity - spread across nodes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cluster
spec:
  replicas: 6   # the required rule below needs at least 6 schedulable nodes
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values: ["redis"]
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis
        image: redis:7-alpine
        ports:
        - containerPort: 6379

# Pod affinity - co-locate on the same node
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logging-agent
spec:
  replicas: 3
  selector:                 # required by apps/v1
    matchLabels:
      app: logging-agent
  template:
    metadata:
      labels:
        app: logging-agent  # must match spec.selector
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values: ["web-app"]
            topologyKey: "kubernetes.io/hostname"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: "app"
                  operator: In
                  values: ["logging-agent"]
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: fluentd
        image: fluent/fluentd:latest
5.4 Scheduler Configuration
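Pod priority decides the order in which pending Pods are considered: higher values first, with Pods that name no class falling back to the `globalDefault` PriorityClass. The sketch below shows only that ordering, using the two class values from this section. The function name is hypothetical, and preemption of already running Pods is not modeled.

```python
def scheduling_order(pods: list, classes: dict, default_class: str) -> list:
    """Sort pending Pods so that higher-priority Pods come first."""
    def priority(pod):
        return classes[pod.get("priorityClassName", default_class)]
    # sorted() is stable, so Pods of equal priority keep their arrival order
    return sorted(pods, key=priority, reverse=True)


classes = {"high-priority": 1000, "low-priority": 100}
pending = [
    {"name": "batch-job"},                                    # falls back to the default class
    {"name": "critical-service", "priorityClassName": "high-priority"},
    {"name": "report"},
]

ordered = scheduling_order(pending, classes, default_class="low-priority")
print([pod["name"] for pod in ordered])
# ['critical-service', 'batch-job', 'report']
```

Only one PriorityClass in a cluster may set `globalDefault: true`, which is why the fallback is a single name rather than a list.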
# Pod priority and preemption
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
description: "High priority for production workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: true
description: "Default priority for batch jobs"
---
# Using a priority class
apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-service
spec:
  selector:                    # required by apps/v1
    matchLabels:
      app: critical-service
  template:
    metadata:
      labels:
        app: critical-service  # must match spec.selector
    spec:
      priorityClassName: high-priority
      containers:
      - name: app
        image: myapp:latest

# Scheduler configuration
kube-scheduler --config=/etc/kubernetes/scheduler-config.yaml

# Label nodes so Pods can choose among them (nodeSelector / node affinity)
kubectl label nodes node1 zone=primary
kubectl label nodes node2 zone=secondary
kubectl label nodes node1 disk-type=ssd
kubectl label nodes node2 disk-type=hdd
5.5 Cluster Monitoring and Logging
# Prometheus monitoring configuration (ServiceMonitor is a Prometheus Operator CRD)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-app-monitor
  labels:
    team: platform
spec:
  selector:
    matchLabels:
      app: web
  endpoints:
  - port: metrics   # name of a port on the selected Service
    path: /metrics
    interval: 15s
  namespaceSelector:
    matchNames:
    - production
---
# Log collection - Fluentd configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  # The json parser assumes Docker-format log files; containerd and CRI-O
  # write CRI-format lines, which need a different parser.
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      <parse>
        @type json
        time_key time
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>
    <filter kubernetes.**>
      @type kubernetes_metadata
      @id kubernetes_metadata
    </filter>
    <match kubernetes.**>
      @type elasticsearch
      host elasticsearch.logging.svc
      port 9200
      logstash_format true
      logstash_prefix kubernetes
    </match>
6. Production Best Practices
6.1 High-Availability Architecture
# High-availability Deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-web-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web   # required: must match spec.selector
    spec:
      affinity:
        # Anti-affinity keeps the Pods on different nodes. Combined with
        # maxSurge: 1 and maxUnavailable: 0, a rollout needs one more node
        # than there are replicas.
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: web
            topologyKey: kubernetes.io/hostname
        # Node affinity prefers nodes in the listed zones. It does not by
        # itself spread Pods evenly across them.
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - zone-a
                - zone-b
                - zone-c
      containers:
      - name: web
        image: myorg/web:v1
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
      terminationGracePeriodSeconds: 60
6.2 Disaster Recovery
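# Backup strategy
# 1. etcd snapshot
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# 2. Restore the cluster (etcd 3.5+ deprecates this in favor of "etcdutl snapshot restore")
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd/restore

# 3. Export resources
# "get all" covers only a subset of resource types (no ConfigMaps, Secrets, Ingresses, or PVCs)
kubectl get all --all-namespaces -o yaml > all-resources.yaml
kubectl get configmaps -n production -o yaml > configmaps.yaml
# The exported Secrets are only Base64-encoded, so store this file encrypted
kubectl get secrets -n production -o yaml > secrets.yaml
Summary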
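As core infrastructure of the cloud-native era, Kubernetes provides powerful container orchestration. Starting from the core concepts, this article covered the main resource objects (Pod, Deployment, Service, ConfigMap, Secret, and others) together with scheduling mechanisms, operational practices, and recommended configurations.
Mastering Kubernetes takes both theory and practice. Readers are encouraged to:
- Get hands-on: set up a local cluster (Minikube or kind) and experiment
- Study the internals: understand the design philosophy and architecture of Kubernetes
- Focus on production: learn operational skills such as high-availability deployment, monitoring, and alerting
- Keep learning: follow the CNCF ecosystem and new Kubernetes releases
Hopefully this article serves as a systematic guide for your journey in learning Kubernetes.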