Redefining Visualization: My Grafana Design Journey

Author: 劉俊夏 2025-01-07 14:09:58

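Introduction

This post focuses on the Grafana YAML files that ship with Prometheus-Operator. I'm not planning to install with Helm, so you can guess what that means: we work with the raw manifests.

We also need to be clear about how each resource is handled: which ones are needed and which aren't; which need tuning and which don't; which need monitoring and which don't. Once that is settled, everything that follows becomes much clearer.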

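Getting Started

What Needs Monitoring

Application-layer monitoring

Application performance metrics:

• Response time: monitor API response times to make sure the service stays responsive.

• Throughput: request counts, transaction counts, and similar figures, to gauge the application's processing capacity.

• Error rate: monitor HTTP error codes (such as 4xx and 5xx) and internal application errors.

Business metrics:

• Monitor key business metrics according to specific business needs (such as user registrations or order volume).

• Log monitoring:

a. Collect and analyze application logs to detect and troubleshoot problems promptly.

Resource usage

CPU and memory utilization:

• Monitor the CPU and memory usage of application instances to avoid resource bottlenecks.

Network traffic:

• Monitor inbound and outbound traffic to make sure network resources are sufficient.

Storage usage:

• If the application uses storage resources, monitor their usage and performance.

Function invocation monitoring (Serverless-specific)

• Invocation count: monitor how often functions are called to understand the load.

• Execution duration: make sure function execution time stays within the expected range.

• Error rate: monitor the share of failed function executions to catch problems early.

Security monitoring

• Access control: monitor abnormal access patterns to guard against potential security threats.

• Vulnerability scanning: regularly scan the application and its dependencies for security vulnerabilities.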

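What Does Not Need Monitoring

Because Alibaba Cloud maintains the infrastructure and some of the components, the following parts usually don't need to be monitored by you:

Infrastructure health:

• The health of the underlying servers, network devices, and storage devices is monitored and maintained by Alibaba Cloud.

Kubernetes control plane:

• Alibaba Cloud keeps components such as the API server, scheduler, and controller manager highly available and stable.

Logs and metrics of base components:

• Logs and performance metrics of components such as etcd and kubelet are usually handled automatically by Alibaba Cloud.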

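Best Practices for Monitoring Design

Define key metrics (KPIs):

• Decide which metrics are critical to the business and to application performance, and monitor those first.

Set up an alerting strategy:

• Set sensible thresholds and alerting rules on the key metrics so problems are detected and handled in time.

Visual dashboards:

• Build intuitive dashboards that show the key metrics in real time, making monitoring and analysis easier.

Review and optimize regularly:

• Periodically revisit the monitoring data and strategy, and adjust them as the business and the application's needs change.

When using an Alibaba Cloud ACK Serverless cluster, monitoring should focus on application performance, business metrics, resource usage, and security. The monitoring tools and services that Alibaba Cloud provides make it possible to cover all of this while reducing the operational burden. A well-designed monitoring setup keeps the application stable and performant and lets you respond quickly to potential problems.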

Prometheus-Operator Manifests

We're using the latest version here. There are two main parts:

CRDs

[Image]

These are the CRDs that Prometheus-Operator uses.

API Resources

[Image]

[Image]

Those are all the YAML files that Prometheus-Operator will use. We can split them into two groups:

API Resources:

• RBAC

• NetworkPolicy

• Service

• ConfigMap

• Secret

• ServiceAccount

• PodDisruptionBudget

• The related controller manifests

CRDs:

• ServiceMonitor

• PrometheusRule

• Alertmanager

• Prometheus
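The two-way split above can be sketched in a few lines of Python. This is only an illustration: the set of custom kinds is copied from the list above, while the function name `split_manifests` and the sample manifests are made up for the example and are not taken from the Prometheus-Operator repository.

```python
# Custom resource kinds, copied from the CRD list above.
CRD_KINDS = {"ServiceMonitor", "PrometheusRule", "Alertmanager", "Prometheus"}


def split_manifests(manifests):
    """Partition parsed manifests into (api_resources, custom_resources) by kind."""
    api_resources, custom_resources = [], []
    for manifest in manifests:
        target = custom_resources if manifest["kind"] in CRD_KINDS else api_resources
        target.append(manifest)
    return api_resources, custom_resources


# Hypothetical sample manifests (names invented for illustration).
sample = [
    {"kind": "Service", "metadata": {"name": "grafana"}},
    {"kind": "ServiceMonitor", "metadata": {"name": "grafana"}},
    {"kind": "ConfigMap", "metadata": {"name": "grafana-dashboards"}},
    {"kind": "PrometheusRule", "metadata": {"name": "example-rules"}},
]

api, custom = split_manifests(sample)
print([m["kind"] for m in api])
print([m["kind"] for m in custom])
```

Keeping the custom kinds in one set makes it easy to see at a glance which files depend on the CRDs being installed first.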

The focus is on Grafana and Prometheus; this post covers Grafana first.

Grafana

The concepts above covered what we need to monitor and what we don't, so here we simply drop the dashboards we don't need. Because the cluster is managed (ACK Serverless), the dashboards for the control plane and for the worker nodes aren't needed.

Prometheus-Operator ships with many dashboards by default:

• Alertmanager-overview

• APIserver

• Cluster-total

• Controller-manager

• Grafana-overview

• k8s-resources-cluster

• k8s-resources-multicluster

• k8s-resources-namespace

• k8s-resources-node

• k8s-resources-pod

• k8s-resources-workload

• k8s-resources-workload-namespace

• Kubelet

• Namespace-by-pod

• Namespace-by-workload

• Node-cluster-rsrc-use

• Node-rsrc-use

• Node-aix

• Nodes-darwin

• Nodes

• Persistentvolumesusage

• Pod-total

• Prometheus-remote-write

• Prometheus

• Proxy

• Scheduler

• Workload-total

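Because an ACK Serverless cluster is node-less and elastic by design, many dashboards tied to traditional Kubernetes physical nodes may have no practical value.

The dashboards listed above are classified below, with a recommendation for each:

Dashboards Recommended to Keep

These dashboards relate to monitoring the Serverless cluster or core services, so keeping them is recommended:

Alertmanager-overview

• Shows Alertmanager's status and alert-related information.

• Keep this dashboard if the monitoring system uses Alertmanager.

Cluster-total

• Monitors overall resource usage and Pod status across the whole cluster.

• In a Serverless cluster, watching Pods and the overall load is meaningful.

Grafana-overview

• Monitors Grafana's own performance and data source status.

• Useful for checking Grafana's health.

k8s-resources-namespace

• Monitors resource usage per namespace (such as CPU, memory, and Pod count).

• Namespaces are still the main means of resource isolation in a Serverless cluster, so keep it.

k8s-resources-pod

• Shows the resource usage of each Pod.

• Pod status and resource consumption still matter in a Serverless cluster.

k8s-resources-workload

• Monitors the state of workloads (such as Deployments and StatefulSets).

• Workloads are the focus in a Serverless cluster, so keeping it is recommended.

k8s-resources-workload-namespace

• Shows workload resources grouped by namespace.

• Keep it if you have applications isolated in multiple namespaces.

Namespace-by-pod

• Shows Pod status and resources by namespace.

• Similar to k8s-resources-pod; suited to finer-grained monitoring per namespace.

Namespace-by-workload

• Shows workload status by namespace.

• Similar to k8s-resources-workload-namespace; keeping it is recommended.

Prometheus-remote-write

• If you use Prometheus remote write (for example to GreptimeDB, which we'll use later), this dashboard shows remote-write status and performance.

Workload-total

• Shows the total resource usage of all workloads.

• In a Serverless cluster the total workload count and overall consumption are worth watching, so keeping it is recommended.

Prometheus

• Monitors Prometheus itself (such as query performance and storage usage).

• Keep it if Prometheus is the monitoring backend.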

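Dashboards Not Recommended to Keep

These dashboards relate to physical nodes or don't apply to a Serverless architecture, so deleting them is recommended:

k8s-resources-node

• Shows the resource usage of each node.

• A Serverless cluster has no physical nodes, so it is meaningless.

Node-cluster-rsrc-use

• Monitors node resource usage across the cluster.

• Same as above: a Serverless cluster has no physical nodes, so deleting it is recommended.

Node-rsrc-use

• Monitors the resource consumption of a single node.

• Same as above: meaningless without physical nodes.

Node-aix

• Monitors nodes running AIX.

• Rarely used in Kubernetes, and meaningless in a Serverless cluster.

Nodes-darwin

• Monitors nodes running Darwin (macOS).

• A Serverless cluster won't use macOS nodes, so it is meaningless.

Nodes

• Shows the status and resource usage of all nodes.

• A Serverless cluster has no notion of nodes, so deleting it is recommended.

Persistentvolumesusage

• Shows persistent volume usage.

• A Serverless cluster usually doesn't use persistent volumes (such as PVCs) directly and relies on external storage services (such as NAS or OSS) instead, so it can be deleted.

Pod-total

• Focuses on the status and resources of all Pods.

• If Cluster-total and k8s-resources-pod are already kept, this dashboard can be deleted. It is also listed in the optional group below, and the final summary treats it as optional.

Proxy

• Shows the status of kube-proxy in Kubernetes.

• A Serverless cluster usually doesn't involve kube-proxy, so it can be deleted.

Kubelet

• Monitors the kubelet on each node.

• A Serverless cluster has no actual kubelet, so it can be deleted.

Scheduler

• Monitors the Kubernetes scheduler's performance and task assignment.

• Can be deleted; we won't use it.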

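Dashboards to Keep Only If Needed

Whether to keep these dashboards depends on your specific needs:

Controller-manager

• Monitors the status of the Kubernetes controller manager.

• The controller manager still exists in a Serverless cluster, but it may not be very important. If you have no particular interest in its performance and status, delete it.

k8s-resources-cluster

• Shows resource usage for the whole cluster.

• If Cluster-total is already kept, this dashboard can be deleted.

k8s-resources-multicluster

• Monitors resource usage across multiple clusters.

• If there is no cross-cluster requirement, or the Serverless cluster is a single cluster, it can be deleted.

Pod-total

• If Workload-total and k8s-resources-pod are already kept, this dashboard can be deleted.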

Final Summary

Dashboards to keep

• Alertmanager-overview

• Cluster-total

• Grafana-overview

• k8s-resources-namespace

• k8s-resources-pod

• k8s-resources-workload

• k8s-resources-workload-namespace

• Namespace-by-pod

• Namespace-by-workload

• Prometheus-remote-write

• Workload-total

• Prometheus

Dashboards to delete

• k8s-resources-node

• APIserver

• Node-cluster-rsrc-use

• Node-rsrc-use

• Node-aix

• Nodes-darwin

• Nodes

• Persistentvolumesusage

• Proxy

• Kubelet

• Scheduler

Optional, keep if needed

• Controller-manager

• k8s-resources-cluster

• k8s-resources-multicluster

• Pod-total
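The three lists above can be written down as plain sets and sanity-checked. The sketch below is only a convenience for bookkeeping: the names are lower-cased copies of the article's lists, and the helper `decide` is a hypothetical name, not part of any tool.

```python
# Lower-cased dashboard names, copied from the three lists above.
KEEP = {
    "alertmanager-overview", "cluster-total", "grafana-overview",
    "k8s-resources-namespace", "k8s-resources-pod", "k8s-resources-workload",
    "k8s-resources-workload-namespace", "namespace-by-pod",
    "namespace-by-workload", "prometheus-remote-write", "workload-total",
    "prometheus",
}
DELETE = {
    "k8s-resources-node", "apiserver", "node-cluster-rsrc-use", "node-rsrc-use",
    "node-aix", "nodes-darwin", "nodes", "persistentvolumesusage", "proxy",
    "kubelet", "scheduler",
}
OPTIONAL = {
    "controller-manager", "k8s-resources-cluster",
    "k8s-resources-multicluster", "pod-total",
}


def decide(name: str) -> str:
    """Return 'keep', 'delete', 'optional', or 'unknown' for a dashboard name."""
    key = name.lower()
    if key in KEEP:
        return "keep"
    if key in DELETE:
        return "delete"
    if key in OPTIONAL:
        return "optional"
    return "unknown"


# The groups must not overlap, and together they cover all 27 default dashboards.
assert not (KEEP & DELETE or KEEP & OPTIONAL or DELETE & OPTIONAL)
print(len(KEEP | DELETE | OPTIONAL))
print(decide("Kubelet"), decide("Pod-total"), decide("Prometheus"))
```

The disjointness check is the useful part: a dashboard that lands in two groups would otherwise go unnoticed until the manifests are edited.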

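Next, some of these dashboards need to be tuned or removed, since many of them will never be used. Before doing that, though, we need to get familiar with the JSON format of a dashboard's configuration. We'll pick one at random; this matters because later we'll also be editing dashboard JSON.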

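Parsing the Grafana Dashboard JSON

This is the config file that defines the Grafana dashboards. I've collapsed it here to keep things compact; otherwise it runs to tens of thousands of lines...

As you can see, the kind is ConfigMapList. A quick explanation: a ConfigMapList is a list containing multiple ConfigMap objects. It is typically used when several ConfigMaps need to be viewed or handled at once. For example, when you query all ConfigMaps with kubectl, the Kubernetes API returns a ConfigMapList object.

Note: running kubectl get configmaplist -A will not return any results, because ConfigMapList is only a list wrapper used for query responses and for bundling items together; it is never stored in the Kubernetes cluster as an object of its own.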
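Since a ConfigMapList is just a wrapper around an items array, dropping unwanted dashboards amounts to filtering that array. The sketch below works on an already-parsed manifest (a Python dict; loading the YAML is left out). It assumes each ConfigMap is named grafana-dashboard-<dashboard>, which is an assumption about the manifest layout, and `prune_dashboards` is a hypothetical helper name.

```python
def prune_dashboards(cm_list, unwanted, prefix="grafana-dashboard-"):
    """Return a copy of a ConfigMapList without the unwanted dashboards.

    `unwanted` holds dashboard names without the (assumed) name prefix.
    """
    if cm_list.get("kind") != "ConfigMapList":
        raise ValueError("expected kind: ConfigMapList")
    kept = []
    for item in cm_list.get("items", []):
        name = item["metadata"]["name"]
        short = name[len(prefix):] if name.startswith(prefix) else name
        if short not in unwanted:
            kept.append(item)
    pruned = dict(cm_list)  # shallow copy; the input list is left untouched
    pruned["items"] = kept
    return pruned


# Hypothetical, heavily trimmed manifest for illustration.
sample = {
    "apiVersion": "v1",
    "kind": "ConfigMapList",
    "items": [
        {"kind": "ConfigMap", "metadata": {"name": "grafana-dashboard-kubelet"}},
        {"kind": "ConfigMap", "metadata": {"name": "grafana-dashboard-prometheus"}},
        {"kind": "ConfigMap", "metadata": {"name": "grafana-dashboard-scheduler"}},
    ],
}

pruned = prune_dashboards(sample, {"kubelet", "scheduler"})
print([cm["metadata"]["name"] for cm in pruned["items"]])
```

If the Grafana Deployment mounts each dashboard ConfigMap as a volume, the matching volume entries would need the same pruning; that part is not shown here.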

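[Image]

To make the later steps easier, we need to be familiar with a Grafana dashboard's JSON file, because we'll be modifying and improving it later. Here is one picked at random. It is very long, so brace yourself; a breakdown follows afterwards: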

{
          "graphTooltip": 1,
          "panels": [
              {
                  "collapsed": false,
                  "gridPos": {
                      "h": 1,
                      "w": 24,
                      "x": 0,
                      "y": 0
                  },
                  "id": 1,
                  "panels": [

                  ],
                  "title": "CPU",
                  "type": "row"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 0,
                      "y": 1
                  },
                  "id": 2,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_cpu_utilisation:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Utilisation"
                      }
                  ],
                  "title": "CPU Utilisation",
                  "type": "timeseries"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 12,
                      "y": 1
                  },
                  "id": 3,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_load1_per_cpu:ratio{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Saturation"
                      }
                  ],
                  "title": "CPU Saturation (Load1 per CPU)",
                  "type": "timeseries"
              },
              {
                  "collapsed": false,
                  "gridPos": {
                      "h": 1,
                      "w": 24,
                      "x": 0,
                      "y": 8
                  },
                  "id": 4,
                  "panels": [

                  ],
                  "title": "Memory",
                  "type": "row"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 0,
                      "y": 9
                  },
                  "id": 5,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_memory_utilisation:ratio{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Utilisation"
                      }
                  ],
                  "title": "Memory Utilisation",
                  "type": "timeseries"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "rds"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 12,
                      "y": 9
                  },
                  "id": 6,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_vmstat_pgmajfault:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Major page Faults"
                      }
                  ],
                  "title": "Memory Saturation (Major Page Faults)",
                  "type": "timeseries"
              },
              {
                  "collapsed": false,
                  "gridPos": {
                      "h": 1,
                      "w": 24,
                      "x": 0,
                      "y": 16
                  },
                  "id": 7,
                  "panels": [

                  ],
                  "title": "Network",
                  "type": "row"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "Bps"
                      },
                      "overrides": [
                          {
                              "matcher": {
                                  "id": "byRegexp",
                                  "options": "/Transmit/"
                              },
                              "properties": [
                                  {
                                      "id": "custom.transform",
                                      "value": "negative-Y"
                                  }
                              ]
                          }
                      ]
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 0,
                      "y": 17
                  },
                  "id": 8,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_network_receive_bytes_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Receive"
                      },
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_network_transmit_bytes_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Transmit"
                      }
                  ],
                  "title": "Network Utilisation (Bytes Receive/Transmit)",
                  "type": "timeseries"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "Bps"
                      },
                      "overrides": [
                          {
                              "matcher": {
                                  "id": "byRegexp",
                                  "options": "/Transmit/"
                              },
                              "properties": [
                                  {
                                      "id": "custom.transform",
                                      "value": "negative-Y"
                                  }
                              ]
                          }
                      ]
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 12,
                      "y": 17
                  },
                  "id": 9,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_network_receive_drop_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Receive"
                      },
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance:node_network_transmit_drop_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "Transmit"
                      }
                  ],
                  "title": "Network Saturation (Drops Receive/Transmit)",
                  "type": "timeseries"
              },
              {
                  "collapsed": false,
                  "gridPos": {
                      "h": 1,
                      "w": 24,
                      "x": 0,
                      "y": 24
                  },
                  "id": 10,
                  "panels": [

                  ],
                  "title": "Disk IO",
                  "type": "row"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 0,
                      "y": 25
                  },
                  "id": 11,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "{{device}}"
                      }
                  ],
                  "title": "Disk IO Utilisation",
                  "type": "timeseries"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 12,
                      "x": 12,
                      "y": 25
                  },
                  "id": 12,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
                          "legendFormat": "{{device}}"
                      }
                  ],
                  "title": "Disk IO Saturation",
                  "type": "timeseries"
              },
              {
                  "collapsed": false,
                  "gridPos": {
                      "h": 1,
                      "w": 24,
                      "x": 0,
                      "y": 34
                  },
                  "id": 13,
                  "panels": [

                  ],
                  "title": "Disk Space",
                  "type": "row"
              },
              {
                  "datasource": {
                      "type": "prometheus",
                      "uid": "${datasource}"
                  },
                  "fieldConfig": {
                      "defaults": {
                          "custom": {
                              "fillOpacity": 100,
                              "showPoints": "never",
                              "stacking": {
                                  "mode": "normal"
                              }
                          },
                          "unit": "percentunit"
                      }
                  },
                  "gridPos": {
                      "h": 7,
                      "w": 24,
                      "x": 0,
                      "y": 35
                  },
                  "id": 14,
                  "options": {
                      "legend": {
                          "showLegend": false
                      },
                      "tooltip": {
                          "mode": "multi",
                          "sort": "desc"
                      }
                  },
                  "pluginVersion": "v11.4.0",
                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "$datasource"
                          },
                          "expr": "sort_desc(1 -\n  (\n    max without (mountpoint, fstype) (node_filesystem_avail_bytes{job=\"node-exporter\", fstype!=\"\", instance=\"$instance\", cluster=\"$cluster\"})\n    /\n    max without (mountpoint, fstype) (node_filesystem_size_bytes{job=\"node-exporter\", fstype!=\"\", instance=\"$instance\", cluster=\"$cluster\"})\n  ) != 0\n)\n",
                          "legendFormat": "{{device}}"
                      }
                  ],
                  "title": "Disk Space Utilisation",
                  "type": "timeseries"
              }
          ],
          "refresh": "30s",
          "schemaVersion": 39,
          "tags": [
              "node-exporter-mixin"
          ],
          "templating": {
              "list": [
                  {
                      "name": "datasource",
                      "query": "prometheus",
                      "type": "datasource"
                  },
                  {
                      "datasource": {
                          "type": "prometheus",
                          "uid": "${datasource}"
                      },
                      "hide": 2,
                      "includeAll": false,
                      "name": "cluster",
                      "query": "label_values(node_time_seconds, cluster)",
                      "refresh": 2,
                      "sort": 1,
                      "type": "query"
                  },
                  {
                      "datasource": {
                          "type": "prometheus",
                          "uid": "${datasource}"
                      },
                      "name": "instance",
                      "query": "label_values(node_exporter_build_info{job=\"node-exporter\", cluster=\"$cluster\"}, instance)",
                      "refresh": 2,
                      "sort": 1,
                      "type": "query"
                  }
              ]
          },
          "time": {
              "from": "now-1h",
              "to": "now"
          },
          "timezone": "utc",
          "title": "Node Exporter / USE Method / Node",
          "uid": "fac67cfbe174d3ef53eb473d73d9212f"
      }

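Overview

This JSON configuration defines a Grafana dashboard named "Node Exporter / USE Method / Node". It contains multiple panels, each showing a different system performance metric: CPU, memory, network, disk I/O, and disk space usage.

Main configuration parameters

• graphTooltip: controls how the crosshair and tooltip behave across panels. 1 enables a shared crosshair, so hovering over one panel marks the same point in time on the others.

• refresh: sets the dashboard to auto-refresh every 30 seconds.

• schemaVersion: the version of the Grafana dashboard schema, 39 here.

• tags: labels attached to the dashboard, here node-exporter-mixin, which help with grouping and search.

• templating: defines the variables used to select the data source, cluster, and instance dynamically.

• time: the default time range, from one hour ago (from: "now-1h") to now (to: "now"). The time zone is set by the separate timezone field, here "utc".

• title: the dashboard's title.

• uid: the dashboard's unique identifier.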
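When a dashboard file runs to thousands of lines, it helps to pull out just these top-level fields before reading any further. The following is a minimal sketch using only the standard json module; the function name `summarize_dashboard` is made up, and SAMPLE is a heavily trimmed copy of the dashboard above.

```python
import json


def summarize_dashboard(raw: str) -> dict:
    """Extract the top-level fields discussed above from dashboard JSON text."""
    dash = json.loads(raw)
    panels = dash.get("panels", [])
    rows = [p for p in panels if p.get("type") == "row"]
    time_range = dash.get("time", {})
    variables = dash.get("templating", {}).get("list", [])
    return {
        "title": dash.get("title"),
        "uid": dash.get("uid"),
        "refresh": dash.get("refresh"),
        "schemaVersion": dash.get("schemaVersion"),
        "tags": dash.get("tags", []),
        "time": (time_range.get("from"), time_range.get("to")),
        "timezone": dash.get("timezone"),
        "variables": [v["name"] for v in variables],
        "row_count": len(rows),
        "chart_count": len(panels) - len(rows),
    }


# Trimmed copy of the dashboard shown above (most panels removed).
SAMPLE = """
{
  "graphTooltip": 1,
  "panels": [
    {"id": 1, "title": "CPU", "type": "row", "panels": []},
    {"id": 2, "title": "CPU Utilisation", "type": "timeseries"}
  ],
  "refresh": "30s",
  "schemaVersion": 39,
  "tags": ["node-exporter-mixin"],
  "templating": {"list": [{"name": "datasource"}, {"name": "cluster"}, {"name": "instance"}]},
  "time": {"from": "now-1h", "to": "now"},
  "timezone": "utc",
  "title": "Node Exporter / USE Method / Node",
  "uid": "fac67cfbe174d3ef53eb473d73d9212f"
}
"""

summary = summarize_dashboard(SAMPLE)
print(summary["title"], summary["refresh"], summary["variables"])
```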

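Templating Variables

Template variables let you pick a different data source, cluster, and instance in the dashboard dynamically, which makes the dashboard more flexible and reusable.

Defined variables

datasource

• Type: data source selector.

• Query: fixed to prometheus, so the user can choose among the available Prometheus data sources.

cluster

• Data source: the data source selected by the ${datasource} variable.

• Query: label_values(node_time_seconds, cluster), which fetches all cluster names.

• Hide: the value 2 hides this variable in the UI.

instance

• Data source: the data source selected by the ${datasource} variable.

• Query: label_values(node_exporter_build_info{job="node-exporter", cluster="$cluster"}, instance), which fetches the instance names for the selected cluster.

These variables are referenced in the panels as $datasource (for the data source uid) and as $cluster and $instance (inside the Prometheus queries) to filter the data dynamically.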
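To see what the substitution amounts to, here is a deliberately simplified sketch that replaces $name and ${name} references with chosen values. It is not Grafana's real interpolation, which also handles multi-value variables and format options; the function name `interpolate` and the sample values are invented for illustration.

```python
import re

# Matches ${name} or $name; simplified compared with Grafana's own syntax.
_VAR = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def interpolate(text: str, values: dict) -> str:
    """Replace known variable references; leave unknown ones untouched."""
    def replace(match):
        name = match.group(1) or match.group(2)
        return values.get(name, match.group(0))
    return _VAR.sub(replace, text)


expr = (
    'instance:node_cpu_utilisation:rate5m{job="node-exporter", '
    'instance="$instance", cluster="$cluster"} != 0'
)
# Hypothetical selections a user might make in the dashboard.
chosen = {"instance": "10.0.0.1:9100", "cluster": "demo", "datasource": "prom"}

print(interpolate(expr, chosen))
print(interpolate("${datasource}", chosen))
```

Because `cluster` is hidden in the UI (hide: 2), its value still takes part in this substitution; the user just cannot change it from the dashboard.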

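Panel Structure (Panels)

The dashboard's panels fall into several main sections, each organized under a collapsible row. The sections and the panels they contain are explained in detail below.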
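In the JSON above, an expanded row ("collapsed": false) has an empty panels array, and the panels that follow it in the top-level list belong to it until the next row appears. A small sketch of that grouping, assuming the panel list has already been parsed into Python dicts (`group_by_row` is a hypothetical helper name):

```python
def group_by_row(panels):
    """Map each row title to the titles of the panels shown under it."""
    groups = {}
    current = None  # panels before the first row are grouped under None
    for panel in panels:
        if panel.get("type") == "row":
            current = panel["title"]
            # A collapsed row carries its panels nested inside itself.
            groups[current] = [child["title"] for child in panel.get("panels", [])]
        else:
            groups.setdefault(current, []).append(panel["title"])
    return groups


# Trimmed copy of the panel list above (only type and title kept).
panels = [
    {"type": "row", "title": "CPU", "collapsed": False, "panels": []},
    {"type": "timeseries", "title": "CPU Utilisation"},
    {"type": "timeseries", "title": "CPU Saturation (Load1 per CPU)"},
    {"type": "row", "title": "Memory", "collapsed": False, "panels": []},
    {"type": "timeseries", "title": "Memory Utilisation"},
    {"type": "timeseries", "title": "Memory Saturation (Major Page Faults)"},
]

for row, titles in group_by_row(panels).items():
    print(row, "->", titles)
```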

CPU monitoring

Title row

{
    "collapsed": false,
    "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 },
    "id": 1,
    "panels": [],
    "title": "CPU",
    "type": "row"
}

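• Purpose: serves as the heading for the CPU monitoring panels, grouping them visually.

• Properties:

a. collapsed: false means the row is expanded.

b. gridPos: defines the panel's position and size on the grid (x and y for position, w and h for width and height).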
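Grafana lays panels out on a grid that is 24 columns wide, which is why the row above uses "w": 24 and the two panels beneath it use "w": 12 each. When editing gridPos by hand it is easy to make panels overlap or run off the grid, so a small check can help. This is a sketch only; `layout_problems` is a hypothetical helper, and panel 99 is invented to trigger both checks.

```python
def overlaps(a, b):
    """True if two gridPos rectangles share any grid cell."""
    return (
        a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    )


def layout_problems(panels, columns=24):
    """Return human-readable layout problems for a list of panels."""
    problems = []
    for panel in panels:
        pos = panel["gridPos"]
        if pos["x"] + pos["w"] > columns:
            problems.append(f"panel {panel['id']} exceeds the {columns}-column grid")
    for i, first in enumerate(panels):
        for second in panels[i + 1:]:
            if overlaps(first["gridPos"], second["gridPos"]):
                problems.append(f"panels {first['id']} and {second['id']} overlap")
    return problems


# The first three panels of the dashboard above.
good = [
    {"id": 1, "gridPos": {"h": 1, "w": 24, "x": 0, "y": 0}},
    {"id": 2, "gridPos": {"h": 7, "w": 12, "x": 0, "y": 1}},
    {"id": 3, "gridPos": {"h": 7, "w": 12, "x": 12, "y": 1}},
]
# Hypothetical badly placed panel, added for illustration.
bad = good + [{"id": 99, "gridPos": {"h": 7, "w": 12, "x": 18, "y": 1}}]

print(layout_problems(good))
print(layout_problems(bad))
```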

{
    "datasource": { "type": "prometheus", "uid": "${datasource}" },
    "fieldConfig": { ... },
    "gridPos": { "h": 7, "w": 12, "x": 0, "y": 1 },
    "id": 2,
    "options": { ... },
    "pluginVersion": "v11.4.0",
    "targets": [
        {
            "expr": "instance:node_cpu_utilisation:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
            "legendFormat": "Utilisation"
        }
    ],
    "title": "CPU Utilisation",
    "type": "timeseries"
}

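• Purpose: shows CPU utilisation as a time series graph.

• Main settings:

a. datasource: uses the Prometheus data source defined above.

b. expr: the Prometheus query, which reads CPU utilisation averaged over a 5-minute rate window.

• Query breakdown:

a. instance:node_cpu_utilisation:rate5m: a custom Prometheus metric (produced by a recording rule) that gives the CPU utilisation of each instance.

b. {job="node-exporter", instance="$instance", cluster="$cluster"}: label filters based on the selected instance and cluster.

c. != 0: drops data points whose value is 0.

• legendFormat: the legend text, shown here as "Utilisation".

• fieldConfig: how the field is displayed, including fill opacity, whether points are drawn, stacking mode, and unit (percent).

• options: legend display and tooltip mode.

• type: the chart type, timeseries.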
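
Since most panels in this Dashboard differ only in metric name, title, and legend, a small builder makes the shared structure visible. The following is a minimal sketch, not how the mixin actually generates its panels; the function name timeseries_panel is hypothetical, and fieldConfig and options are omitted.

```python
def timeseries_panel(panel_id, title, metric, legend, x, y):
    """Return a minimal timeseries panel for one recording-rule metric."""
    # The same label filters are used by every panel in this Dashboard.
    selector = 'job="node-exporter", instance="$instance", cluster="$cluster"'
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "datasource": {"type": "prometheus", "uid": "${datasource}"},
        "gridPos": {"h": 7, "w": 12, "x": x, "y": y},
        "targets": [
            # "!= 0" drops samples whose value is 0.
            {"expr": f"{metric}{{{selector}}} != 0", "legendFormat": legend}
        ],
    }


if __name__ == "__main__":
    panel = timeseries_panel(
        2, "CPU Utilisation", "instance:node_cpu_utilisation:rate5m",
        "Utilisation", x=0, y=1,
    )
    print(panel["targets"][0]["expr"])
```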

{
    "expr": "instance:node_load1_per_cpu:ratio{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
    "legendFormat": "Saturation",
    ...
    "title": "CPU Saturation (Load1 per CPU)",
    "type": "timeseries"
}

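• Purpose: shows the 1-minute load average per CPU, used to judge CPU saturation.

• Main settings:

a. expr: queries each instance's 1-minute load divided by its CPU count.

b. legendFormat: the legend reads "Saturation".

Memory monitoring

Title row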

{
    "collapsed": false,
    "gridPos": { "h": 1, "w": 24, "x": 0, "y": 8 },
    "id": 4,
    "panels": [],
    "title": "Memory",
    "type": "row"
}

• Purpose: serves as the heading for the memory panels.

{
    "expr": "instance:node_memory_utilisation:ratio{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
    "legendFormat": "Utilisation",
    ...
    "title": "Memory Utilisation",
    "type": "timeseries"
}

• Purpose: shows memory utilisation as a time series graph.

• Main settings:

a. expr: queries the memory utilisation ratio.

b. legendFormat: the legend reads "Utilisation".

{
    "expr": "instance:node_vmstat_pgmajfault:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
    "legendFormat": "Major page Faults",
    ...
    "title": "Memory Saturation (Major Page Faults)",
    "type": "timeseries"
}

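• Purpose: tracks the major page fault rate, which reflects memory saturation.

• Main settings:

a. expr: queries each instance's major page fault rate.

b. legendFormat: the legend reads "Major page Faults".

Network monitoring

Title row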

{
    "collapsed": false,
    "gridPos": { "h": 1, "w": 24, "x": 0, "y": 16 },
    "id": 7,
    "panels": [],
    "title": "Network",
    "type": "row"
}

• Purpose: serves as the heading for the network panels.

{
    "targets": [
        {
            "expr": "instance:node_network_receive_bytes_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
            "legendFormat": "Receive"
        },
        {
            "expr": "instance:node_network_transmit_bytes_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
            "legendFormat": "Transmit"
        }
    ],
    ...
    "title": "Network Utilisation (Bytes Receive/Transmit)",
    "type": "timeseries",
    "fieldConfig": {
        "overrides": [
            {
                "matcher": { "id": "byRegexp", "options": "/Transmit/" },
                "properties": [
                    { "id": "custom.transform", "value": "negative-Y" }
                ]
            }
        ]
    }
}

• Purpose: shows received and transmitted bytes as a time series graph.

• Main settings:

a. expr: two queries return the receive and transmit byte rates respectively.

b. legendFormat: shown as "Receive" and "Transmit".

c. fieldConfig.overrides: applies the negative-Y transform to the "Transmit" series, so it is drawn below the axis and mirrors the receive series.

{
    "targets": [
        {
            "expr": "instance:node_network_receive_drop_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
            "legendFormat": "Receive"
        },
        {
            "expr": "instance:node_network_transmit_drop_excluding_lo:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
            "legendFormat": "Transmit"
        }
    ],
    ...
    "title": "Network Saturation (Drops Receive/Transmit)",
    "type": "timeseries",
    "fieldConfig": {
        "overrides": [
            {
                "matcher": { "id": "byRegexp", "options": "/Transmit/" },
                "properties": [
                    { "id": "custom.transform", "value": "negative-Y" }
                ]
            }
        ]
    }
}

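• Purpose: shows dropped packets on receive and transmit as a time series graph.

• Main settings:

a. expr: two queries return the receive and transmit drop rates respectively.

b. legendFormat: shown as "Receive" and "Transmit".

c. fieldConfig.overrides: likewise flips the "Transmit" series to negative values so that it mirrors the receive series.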
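
To make the override used by both network panels concrete, here is a small sketch. negative_y_override reproduces the override object from the JSON above; mirrored only imitates the visual effect on a list of sample values and is not Grafana code. Both function names are hypothetical.

```python
def negative_y_override(pattern):
    """Field override that draws series whose name matches `pattern` below the X axis."""
    return {
        "matcher": {"id": "byRegexp", "options": pattern},
        "properties": [{"id": "custom.transform", "value": "negative-Y"}],
    }


def mirrored(values):
    """Imitate what the chart draws for a negative-Y series (display only)."""
    return [-v for v in values]


if __name__ == "__main__":
    print(negative_y_override("/Transmit/"))
    # Receive stays above the axis, Transmit is drawn below it.
    print(mirrored([1.5, 2.0]))
```

The transform only changes how the series is drawn; the query results themselves stay positive.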

Disk IO monitoring

Title row

{
    "collapsed": false,
    "gridPos": { "h": 1, "w": 24, "x": 0, "y": 24 },
    "id": 10,
    "panels": [],
    "title": "Disk IO",
    "type": "row"
}

• Purpose: serves as the heading for the disk I/O panels.

{
    "expr": "instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
    "legendFormat": "{{device}}",
    ...
    "title": "Disk IO Utilisation",
    "type": "timeseries"
}

• Purpose: shows disk I/O time utilisation as a time series graph.

• Main settings:

a. expr: queries the I/O time rate of each device.

b. legendFormat: uses the device name ({{device}}) as the legend.

{
    "expr": "instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\", instance=\"$instance\", cluster=\"$cluster\"} != 0",
    "legendFormat": "{{device}}",
    ...
    "title": "Disk IO Saturation",
    "type": "timeseries"
}

• Purpose: tracks weighted disk I/O time, which reflects I/O saturation.

• Main settings:

a. expr: queries the weighted I/O time rate of each device.

b. legendFormat: uses the device name as the legend.

Disk Space monitoring

Title row

{
    "collapsed": false,
    "gridPos": { "h": 1, "w": 24, "x": 0, "y": 34 },
    "id": 13,
    "panels": [],
    "title": "Disk Space",
    "type": "row"
}

• Purpose: serves as the heading for the disk space panels.

{
    "expr": "sort_desc(1 -\n  (\n    max without (mountpoint, fstype) (node_filesystem_avail_bytes{job=\"node-exporter\", fstype!=\"\", instance=\"$instance\", cluster=\"$cluster\"})\n    /\n    max without (mountpoint, fstype) (node_filesystem_size_bytes{job=\"node-exporter\", fstype!=\"\", instance=\"$instance\", cluster=\"$cluster\"})\n  ) != 0\n)\n",
    "legendFormat": "{{device}}",
    ...
    "title": "Disk Space Utilisation",
    "type": "timeseries"
}

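• Purpose: shows disk space utilisation as a time series graph.

• Main settings:

1) expr: a more involved Prometheus query that computes disk space usage. Query breakdown:

• node_filesystem_avail_bytes: available disk space in bytes.

• node_filesystem_size_bytes: total disk space in bytes.

• max without (mountpoint, fstype): aggregates away the mountpoint and fstype labels, so each device yields one series.

• Calculation: 1 - (available space / total space), i.e. the fraction of space in use.

• sort_desc: sorts the results in descending order.

• != 0: drops data points whose value is 0.

2) legendFormat: uses the device name as the legend.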
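
The arithmetic of this query can be followed on plain numbers. The sketch below mirrors the steps in Python for a single instance; the function name and the sample values are made up for illustration, and real PromQL works on labelled time series rather than tuples.

```python
def disk_space_utilisation(filesystems):
    """Mirror sort_desc(1 - max(avail) / max(size) != 0), grouped per device.

    `filesystems` is a list of (device, avail_bytes, size_bytes) samples;
    several samples may share a device (different mountpoints).
    """
    avail, size = {}, {}
    for device, avail_bytes, size_bytes in filesystems:
        # max without (mountpoint, fstype): keep one value per device
        avail[device] = max(avail.get(device, 0), avail_bytes)
        size[device] = max(size.get(device, 0), size_bytes)
    used = {d: 1 - avail[d] / size[d] for d in size if size[d] > 0}
    # "!= 0" drops devices with nothing used; sort_desc puts the fullest first
    return sorted(
        ((d, u) for d, u in used.items() if u != 0),
        key=lambda item: item[1],
        reverse=True,
    )


if __name__ == "__main__":
    samples = [
        ("/dev/sdb1", 50, 100),  # half full
        ("/dev/sda1", 25, 100),  # mounted twice, three quarters full
        ("/dev/sda1", 25, 100),
        ("tmpfs", 10, 10),       # empty, removed by the != 0 filter
    ]
    for device, ratio in disk_space_utilisation(samples):
        print(device, ratio)
```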

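Panel configuration in detail

All panels share roughly the same configuration structure. The main settings are explained below.

datasource

• Description: the data source the panel uses. Every panel here uses the Prometheus data source selected through the ${datasource} template variable.

• Format: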

"datasource": {
    "type": "prometheus",
    "uid": "${datasource}"
}

fieldConfig

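• Description: display properties of the fields, made up of defaults and custom overrides.

• Main settings:

1) defaults: the default field configuration.

• custom.fillOpacity: fill opacity; 100 means fully opaque.

• custom.showPoints: whether data points are drawn; set to "never" here, so they are not.

• custom.stacking.mode: stacking mode; set to "normal" here, i.e. regular stacking.

• unit: the unit of the data, such as percentunit (percent) or Bps (bytes per second).

2) overrides: overrides the configuration for fields that match a condition, for example flipping the "Transmit" series to negative values.

gridPos

• Description: the panel's position and size in the Dashboard grid.

• Properties:

a. h: height (in grid rows).

b. w: width (in grid columns).

c. x: horizontal start position (grid column index).

d. y: vertical start position (grid row index).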
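
When positioning panels by hand it helps to check the numbers before importing the JSON. The sketch below relies on the fact that Grafana's dashboard grid is 24 columns wide; the helper names are hypothetical, and the x=12 position of the second panel is an assumption, since that panel's gridPos is elided in the snippets above.

```python
GRID_COLUMNS = 24  # Grafana lays panels out on a 24-column grid


def fits_grid(pos):
    """True if the panel stays inside the 24-column grid."""
    return pos["x"] >= 0 and pos["w"] > 0 and pos["x"] + pos["w"] <= GRID_COLUMNS


def overlaps(a, b):
    """True if two gridPos rectangles share at least one grid cell."""
    return (
        a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    )


if __name__ == "__main__":
    row = {"h": 1, "w": 24, "x": 0, "y": 0}    # the "CPU" title row
    left = {"h": 7, "w": 12, "x": 0, "y": 1}   # CPU Utilisation
    right = {"h": 7, "w": 12, "x": 12, "y": 1}  # assumed position of the next panel
    print(all(fits_grid(p) for p in (row, left, right)))
    print(overlaps(left, right))
```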

targets

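• Description: the data queries for the panel, here mostly Prometheus queries.

• Properties:

a. expr: the Prometheus query expression.

b. legendFormat: the legend format, used to tell the series apart.

options

• Description: display options for the chart.

• Main settings:

1) legend.showLegend: whether the legend is displayed; set to false here, so it is hidden.

2) tooltip: how the tooltip is displayed.

a. mode: "multi" shows all series in the tooltip.

b. sort: "desc" sorts the values in descending order.

type

• Description: the chart type; most panels here use timeseries, i.e. a time series graph.

After reading this, go through a few more dashboards to let it sink in. They are all much the same, with only some parameters differing. If anything is unclear, just ask an AI; it is convenient, and you should make good use of your tools, or you will get left behind.

If you need to customize dashboards later on, you will be able to do so with ease.

Extensions

I also need to add some extra Dashboards here. They are JSON files, which I define directly inside the YAML files.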

ArgoCD

ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-servicemonitor
  namespace: monitoring
  labels:
    app.kubernetes.io/name: argocd
    app.kubernetes.io/part-of: argocd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-server  # must match the labels on the ArgoCD Service
  namespaceSelector:
    matchNames:
      - argocd  # namespace where ArgoCD runs
  endpoints:
    - port: metrics  # name of the Service port that Prometheus scrapes
      path: /metrics      # metrics endpoint path
      interval: 30s       # scrape interval
      scrapeTimeout: 10s  # scrape timeout
      tlsConfig:          # enable the following if TLS is required
        insecureSkipVerify: true

In general:

• ServiceMonitor scrapes the metrics of the Pods behind a given Service; it fits cases where the monitored Pods share a consistent Service.

• PodMonitor scrapes the metrics of Pods selected directly by their labels; it fits cases where the monitored Pods have no Service and are deployed in differing ways.
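
The way a ServiceMonitor picks its targets can be illustrated with a small sketch: a Service is selected when it lives in one of the listed namespaces and carries every label listed under matchLabels. The function and the sample Services below are hypothetical and only imitate the matching logic; they do not talk to a cluster.

```python
def select_services(services, match_labels, namespaces):
    """Return the names of Services a ServiceMonitor-style selector would pick."""
    return [
        svc["name"]
        for svc in services
        if svc["namespace"] in namespaces
        # every key/value under matchLabels must be present on the Service
        and all(svc["labels"].get(k) == v for k, v in match_labels.items())
    ]


if __name__ == "__main__":
    services = [
        {"name": "svc-a", "namespace": "argocd",
         "labels": {"app.kubernetes.io/name": "argocd-server"}},
        {"name": "svc-b", "namespace": "argocd",
         "labels": {"app.kubernetes.io/name": "something-else"}},
        {"name": "svc-c", "namespace": "default",
         "labels": {"app.kubernetes.io/name": "argocd-server"}},
    ]
    print(select_services(
        services, {"app.kubernetes.io/name": "argocd-server"}, ["argocd"]
    ))
```

If the selector matches no Service, no targets are scraped, so it is worth checking the labels on the actual Service before applying the manifest.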

Dashboard JSON file

It is too large to show here; if you need it, you can get it from this address^[1].

CoreDNS

ServiceMonitor

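Prometheus-Operator ships this one already, so there is nothing to do here; only the Dashboard configuration needs attention.

Dashboard JSON file

It is too large to show here; if you need it, you can get it from this address^[2].

Modifying the configuration files

We need to set up the ConfigMaps and then the workload manifest, mainly to mount the newly added Dashboards into Grafana. I demonstrate one here; any further ones can follow the same pattern:

[Image]

As you can see, I added two more, which now have to be mounted into Grafana. Also, Grafana is a Deployment by default; I need persistence here, so I changed it to a StatefulSet.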

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 11.4.0
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: grafana
      app.kubernetes.io/name: grafana
      app.kubernetes.io/part-of: kube-prometheus
  template:
    metadata:
      annotations:
        checksum/grafana-config: cb0d6303ddbb694464bde843b0fe874c
        checksum/grafana-dashboardproviders: ca302ceedc58d72663436a77e5e0ea29
        checksum/grafana-datasources: b748e773cdfff19dcfe874d29600675b
      labels:
        app.kubernetes.io/component: grafana
        app.kubernetes.io/name: grafana
        app.kubernetes.io/part-of: kube-prometheus
        app.kubernetes.io/version: 11.4.0
    spec:
      automountServiceAccountToken: false
      containers:
      - env: []
        image: grafana/grafana:11.4.0
        name: grafana
        ports:
        - containerPort: 3000
          name: http
        readinessProbe:
          httpGet:
            path: /api/health
            port: http
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          seccompProfile:
            type: RuntimeDefault
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-storage
          readOnly: false
        - mountPath: /etc/grafana/provisioning/datasources
          name: grafana-datasources
          readOnly: false
        - mountPath: /etc/grafana/provisioning/dashboards
          name: grafana-dashboards
          readOnly: false
        - mountPath: /tmp
          name: tmp-plugins
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/cluster-total
          name: grafana-dashboard-cluster-total
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/grafana-overview
          name: grafana-dashboard-grafana-overview
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-cluster
          name: grafana-dashboard-k8s-resources-cluster
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-namespace
          name: grafana-dashboard-k8s-resources-namespace
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-node
          name: grafana-dashboard-k8s-resources-node
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-pod
          name: grafana-dashboard-k8s-resources-pod
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-workload
          name: grafana-dashboard-k8s-resources-workload
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-workloads-namespace
          name: grafana-dashboard-k8s-resources-workloads-namespace
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/namespace-by-pod
          name: grafana-dashboard-namespace-by-pod
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/namespace-by-workload
          name: grafana-dashboard-namespace-by-workload
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/prometheus-remote-write
          name: grafana-dashboard-prometheus-remote-write
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/workload-total
          name: grafana-dashboard-workload-total
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/prometheus
          name: grafana-dashboard-prometheus
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/argocd     # the new dashboards are mounted here
          name: grafana-dashboard-argocd
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/coredns
          name: grafana-dashboard-coredns
          readOnly: false
        - mountPath: /etc/grafana
          name: grafana-config
          readOnly: false
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        fsGroup: 65534
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: grafana
      volumes:
      - name: grafana-datasources
        secret:
          secretName: grafana-datasources
      - configMap:
          name: grafana-dashboards
        name: grafana-dashboards
      - emptyDir:
          medium: Memory
        name: tmp-plugins
      - configMap:
          name: grafana-dashboard-cluster-total
        name: grafana-dashboard-cluster-total
      - configMap:
          name: grafana-dashboard-grafana-overview
        name: grafana-dashboard-grafana-overview
      - configMap:
          name: grafana-dashboard-k8s-resources-cluster
        name: grafana-dashboard-k8s-resources-cluster
      - configMap:
          name: grafana-dashboard-k8s-resources-namespace
        name: grafana-dashboard-k8s-resources-namespace
      - configMap:
          name: grafana-dashboard-k8s-resources-node
        name: grafana-dashboard-k8s-resources-node
      - configMap:
          name: grafana-dashboard-k8s-resources-pod
        name: grafana-dashboard-k8s-resources-pod
      - configMap:
          name: grafana-dashboard-k8s-resources-workload
        name: grafana-dashboard-k8s-resources-workload
      - configMap:
          name: grafana-dashboard-k8s-resources-workloads-namespace
        name: grafana-dashboard-k8s-resources-workloads-namespace
      - configMap:
          name: grafana-dashboard-namespace-by-pod
        name: grafana-dashboard-namespace-by-pod
      - configMap:
          name: grafana-dashboard-prometheus-remote-write
        name: grafana-dashboard-prometheus-remote-write
      - configMap:
          name: grafana-dashboard-namespace-by-workload
        name: grafana-dashboard-namespace-by-workload
      - configMap:
          name: grafana-dashboard-workload-total
        name: grafana-dashboard-workload-total
      - configMap:
          name: grafana-dashboard-alertmanager-overview
        name: grafana-dashboard-alertmanager-overview
      - configMap:
          name: grafana-dashboard-prometheus
        name: grafana-dashboard-prometheus
      - configMap:                    # newly added below; once defined here, they can be mounted above
          name: grafana-dashboard-argocd
        name: grafana-dashboard-argocd
      - configMap:
          name: grafana-dashboard-coredns
        name: grafana-dashboard-coredns
      - name: grafana-config
        secret:
          secretName: grafana-config
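
Each dashboard needs two matching entries, one under volumeMounts and one under volumes, and it is easy to add only one of them. In the manifest above, for instance, grafana-dashboard-alertmanager-overview is declared as a volume but has no volumeMounts entry. The sketch below generates both entries from a list of names and reports unmounted volumes; the helper names are hypothetical, and it assumes the ConfigMaps follow the grafana-dashboard-<name> convention used above.

```python
def dashboard_mounts(names):
    """Build matching volumeMounts and volumes entries for dashboard ConfigMaps."""
    mounts = [
        {
            "mountPath": f"/grafana-dashboard-definitions/0/{n}",
            "name": f"grafana-dashboard-{n}",
            "readOnly": False,
        }
        for n in names
    ]
    volumes = [
        {
            "name": f"grafana-dashboard-{n}",
            "configMap": {"name": f"grafana-dashboard-{n}"},
        }
        for n in names
    ]
    return mounts, volumes


def unmounted_volumes(mounts, volumes):
    """Names of volumes that are declared but never mounted."""
    mounted = {m["name"] for m in mounts}
    return sorted(v["name"] for v in volumes if v["name"] not in mounted)


if __name__ == "__main__":
    mounts, volumes = dashboard_mounts(["argocd", "coredns"])
    print(mounts[0]["mountPath"])
    print(unmounted_volumes(mounts, volumes))
```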

Next, we need to persist the data by adding volumeClaimTemplates under the StatefulSet's spec:

volumeClaimTemplates:
    - metadata:
        name: grafana-storage
      spec:
        storageClassName: alicloud-nas-subpath
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 15Gi

I also need to adjust Grafana's config file here, mainly for some tuning and to set the initial password:

apiVersion: v1
kind: Secret
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 11.4.0
  name: grafana-config
  namespace: monitoring
stringData:
  grafana.ini: |
    [date_formats]
    default_timezone = UTC

    [security]
    admin_user = admin
    admin_password = j019e99392129
type: Opaque

Since we will later set up Grafana Reporter for automated PDF report generation, I tuned this setting up front:

[rendering]
    concurrent_render_request_limit = 70
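
The [rendering] section goes into the same grafana.ini as the other sections. As a quick sanity check before applying the Secret, the combined file can be parsed locally. This is only a sketch: Python's configparser approximates Grafana's INI parsing, and the password below is a placeholder rather than the real value.

```python
import configparser

# Combined grafana.ini content; "change-me" is a placeholder password.
GRAFANA_INI = """
[date_formats]
default_timezone = UTC

[security]
admin_user = admin
admin_password = change-me

[rendering]
concurrent_render_request_limit = 70
"""


def load_ini(text):
    """Parse grafana.ini-style text so individual settings can be checked."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = load_ini(GRAFANA_INI)

if __name__ == "__main__":
    print(config.sections())
    print(config.getint("rendering", "concurrent_render_request_limit"))
```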

Conclusion

Some details still need polishing later; for example, where a Dashboard shows incorrect data, we will have to modify and tune it.

That wraps up our Grafana journey for now.

But this is only the beginning, a great beginning.

Reference links

[1] Address: https://grafana.com/grafana/dashboards/14584-argocd/

[2] Address: https://grafana.com/grafana/dashboards/14981-coredns/

Editor: 武曉燕  Source: 云原生運維圈
