Prometheus and Traditional Checks for Processes, Ports, and Intranet Domains

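Company A adopted Prometheus as its monitoring tool and used it to monitor ports, processes, and intranet domain status on both Windows and Linux. The workflow covers tool selection and deployment, port and process monitoring, alerting, and data visualization and tuning, with the aim of keeping the IT infrastructure stable and secure.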

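Background

Company A needs to monitor its IT infrastructure, including ports, processes, and intranet domain status on Windows and Linux. As the business grows, keeping systems stable and secure becomes increasingly important. Traditional monitoring methods may fall short on timeliness and flexibility, so the company adopted Prometheus to collect system status and performance metrics more efficiently.

Goals

  1. Cross-platform monitoring: choose monitoring tools that can be installed and run on both Windows and Linux.
  2. Real-time port and process monitoring: monitor the ports and processes of key services in real time to confirm service availability and raise alerts promptly.
  3. Intranet domain status checks: regularly check that intranet domains resolve and are reachable so that internal services keep running normally.
  4. SSL certificate monitoring: automatically check the validity and expiry date of SSL certificates so that they stay within their validity period.
  5. Visualization and alerting: display monitoring data in a visualization tool (such as Grafana) and configure alerts so that the monitoring and development staff concerned are notified in time.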

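Workflow

  • Tool selection and deployment

Prometheus is used as the monitoring system, with exporters (such as node_exporter, process_exporter, and a custom port collector) covering the different monitoring needs.

Prometheus and the related exporters are deployed on the Windows and Linux servers.

  • Port and process monitoring

node_exporter is used to monitor the port and process status of the system.

Prometheus is configured to scrape the metrics exposed by node_exporter so that specific ports and processes can be monitored.

  • Alerting

Exporters perform the process and port status checks, and HTTP probes are configured to confirm service availability.

The Prometheus data source is connected to Grafana, and dashboards are created to visualize the monitoring data.

  • Testing and tuning

After rollout, the monitoring system is tested to confirm that it captures every metric accurately.

Monitoring policies and alert rules are then adjusted based on feedback to keep the system running efficiently.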

Prometheus Process and Port Configuration

① Dockerfile build configuration

cat Dockerfile
FROM python
ENV LANG=C.UTF-8
ENV TZ=Asia/Shanghai
# asyncio is part of the Python 3 standard library, so it is not installed from PyPI
RUN pip install --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple pyyaml requests prometheus_client Flask
COPY host_port_monitor.py   /opt/
CMD ["sleep","999"]

# Build the image and push it to the registry
docker build -t harbor.export.cn/ops/service_status_monitor_export_port:v2 .
docker push  harbor.export.cn/ops/service_status_monitor_export_port:v2

② Custom port collector with normalized labels

cat host_port_monitor.py
# -*- coding:utf-8 -*-
import socket
import os
import yaml
import prometheus_client
from prometheus_client import Gauge
from prometheus_client.core import CollectorRegistry
from flask import Response, Flask
import re
import asyncio


app = Flask(__name__)


def get_config_dic():
    """
    Load YAML config file and return it as a dictionary.
    """
    pro_path = os.path.dirname(os.path.realpath(__file__))
    yaml_path = os.path.join(pro_path, "host_port_conf.yaml")
    with open(yaml_path, "r", encoding="utf-8") as f:
        sdata = yaml.full_load(f)
    return sdata


async def explore_udp_port(ip, port):
    """
    Send a test datagram to the given UDP port.
    UDP is connectionless, so a result of 1 only means the datagram could be
    sent; it does not prove that a service is listening on the port.
    """
    try:
        loop = asyncio.get_running_loop()
        transport, protocol = await loop.create_datagram_endpoint(
            lambda: UDPProbe(),
            remote_addr=(ip, int(port))
        )
        transport.close()
        return 1
    except Exception:
        return 0


class UDPProbe(asyncio.DatagramProtocol):
    def connection_made(self, transport):
        self.transport = transport
        self.transport.sendto(b'test')

    def datagram_received(self, data, addr):
        self.transport.close()

    def error_received(self, exc):
        pass

    def connection_lost(self, exc):
        pass


def explore_tcp_port(ip, port):
    """
    Check if the TCP port is open on the given IP.
    """
    try:
        # Apply the timeout to the connection attempt itself and close the socket afterwards
        with socket.create_connection((ip, int(port)), timeout=0.5):
            return 1
    except (OSError, ValueError, TypeError):
        return 0


def is_valid_label_name(label_name):
    """
    Check if the label name is valid according to Prometheus conventions.
    """
    return re.match(r'^[a-zA-Z_][a-zA-Z0-9_]*$', label_name) is not None




def format_label_name(label_name):
    """
    Format invalid label_name to valid ones by replacing invalid characters.
    """
    return re.sub(r'[^a-zA-Z0-9_]', '_', label_name)




def check_port():
    """
    Check the ports for all configured services and apply different requirements.
    """
    sdic = get_config_dic()
    result_list = []
    for sertype, config in sdic.items():
        iplist = config.get("host") or []
        portlist = config.get("port") or []
        requirement = config.get("requirement")
        protocol_list = config.get("protocol", ["tcp"])
        # The YAML may give a single string such as "tcp"; iterating over a string
        # would yield single characters, so wrap it in a list first
        if isinstance(protocol_list, str):
            protocol_list = [protocol_list]

        # Extract dynamic labels and filter valid ones
        dynamic_labels = {key: value for key, value in config.items() if key not in ['host', 'port', 'requirement', 'protocol']}
        valid_labels = {format_label_name(key): value for key, value in dynamic_labels.items() if is_valid_label_name(key)}

        service_results = []
        for ip in iplist:
            for port in portlist:
                for protocol in protocol_list:
                    if protocol == "udp":
                        status = asyncio.run(explore_udp_port(ip, port))
                    else:
                        # "tcp" and any unrecognized protocol use the TCP check
                        status = explore_tcp_port(ip, port)
                    result_dic = {"sertype": sertype, "host": ip, "port": str(port), "status": status}
                    # Merge valid dynamic labels into the result dictionary
                    result_dic.update(valid_labels)
                    service_results.append(result_dic)

        # "all": every probe must succeed; "any": one successful probe is enough.
        # Any other value (for example "check") keeps each probe's own status.
        if requirement == "all":
            combined = int(all(r["status"] for r in service_results))
        elif requirement == "any":
            combined = int(any(r["status"] for r in service_results))
        else:
            combined = None

        if combined is not None:
            for result in service_results:
                result["status"] = combined

        result_list.extend(service_results)

    return result_list




@app.route("/metrics")
def api_response():
    """
    Generate Prometheus metrics based on the checked ports.
    """
    checkport = check_port()
    REGISTRY = CollectorRegistry(auto_describe=False)


    # Define the metric with labels dynamically
    base_labels = ["sertype", "host", "port"]
    dynamic_labels = set()


    # Collect all unique dynamic labels
    for datas in checkport:
        dynamic_labels.update(datas.keys())


    # Exclude base labels; sort so the label order is stable between scrapes
    dynamic_labels = sorted(dynamic_labels.difference(base_labels))


    # Create a Gauge with all valid labels
    muxStatus = Gauge("server_port_up", "Api response stats is:", base_labels + list(dynamic_labels), registry=REGISTRY)


    for datas in checkport:
        # Extract base label values
        sertype = datas.get("sertype")
        host = datas.get("host")
        port = datas.get("port")
        status = datas.get("status")


        # Prepare label values for dynamic labels
        label_values = [sertype, host, port] + [datas.get(label, "unknown") for label in dynamic_labels]


        muxStatus.labels(*label_values).set(status)


    return Response(prometheus_client.generate_latest(REGISTRY), mimetype="text/plain")




if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

# Ports, labels, protocols, and so on are configured in YAML, which makes alerts easier to read.
cat  host_port_conf.yaml
# Prometheus monitor server port config.
pw:
  env: "prod"
  applicationowner: "zhangsan"
  applicationname: "zhangsan"
  vendor: "export"
  techowner: "zhangsan"
  service: "pass"
  host:
    - "10.10.10.123"
    - "10.10.10.124"
  port:
    - 3389
  requirement: "check"  # report each probe as-is: up if reachable, down if not
  protocol: "tcp"


ack:
  env: "prod"
  applicationowner: "zhangsan"
  applicationname: "zhangsan"
  vendor: "export"
  techowner: "zhangsan"
  service: "good"
  host:
    - "10.10.10.128"
    - "10.10.10.129"
  port:
    - 1858
  requirement: "any"  # the whole service counts as up if at least one probe succeeds
  protocol: "tcp"
# Test locally: query the endpoint to see the status returned for the monitored ports
curl  -s -k http://ip:8080/metrics
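
The effect of the requirement and protocol settings in the YAML above can be illustrated in isolation. The following is a minimal sketch, not the collector itself: the helper names normalize_protocols and aggregate_status are hypothetical, and the sketch assumes that "all" means every probe must succeed, "any" means one success is enough, and any other value (such as "check") leaves each probe's status untouched.

```python
from typing import List, Optional, Union


def normalize_protocols(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Accept 'tcp', ['tcp', 'udp'], or None and always return a list."""
    if value is None:
        return ["tcp"]
    if isinstance(value, str):
        return [value.lower()]
    return [str(item).lower() for item in value]


def aggregate_status(statuses: List[int], requirement: str) -> List[int]:
    """Apply the service-level requirement to a list of per-probe statuses.

    "all": every probe reports 1 only if all probes succeeded.
    "any": every probe reports 1 if at least one probe succeeded.
    Anything else (for example "check"): statuses are returned unchanged.
    """
    if requirement == "all":
        combined = int(all(statuses))
    elif requirement == "any":
        combined = int(any(statuses))
    else:
        return list(statuses)
    return [combined] * len(statuses)


if __name__ == "__main__":
    # Two hosts, one reachable and one not
    probes = [1, 0]
    for requirement in ("check", "any", "all"):
        print(requirement, aggregate_status(probes, requirement))
```

With "any", a two-node service whose standby node is down still reports as up on both series, which suits active/standby pairs; "check" suits hosts that must each be reachable on their own.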

③ Kubernetes Deployment, Service, and ConfigMap

cat prometheus-service-monitor-export-port-deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-monitor-export-port
  namespace: monitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-service-monitor-export-port
  template:
    metadata:
      labels:
        app: prometheus-service-monitor-export-port
    spec:
      containers:
      - name: service-monitor-export-port
        image: harbor.export.cn/ops/service_status_monitor_export_port:v2
        imagePullPolicy: Always
        command: ["sh","-c"]
        args: ["python  /opt/host_port_monitor.py"]
        resources:
          requests:
            cpu: 500m
            memory: 500Mi
          limits:
            cpu: 4000m
            memory: 4000Mi
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: host-port-conf-volume
          mountPath: /opt/host_port_conf.yaml
          subPath: host_port_conf.yaml
      volumes:
      - name: host-port-conf-volume
        configMap:
          name: host-port-conf
---
apiVersion: v1
kind: Service
metadata:
  name: service-monitor-export-port
  namespace: monitor
spec:
  selector:
    app: prometheus-service-monitor-export-port
  ports:
    - name: http
      port: 80
      targetPort: 8080
  type: ClusterIP
cat prometheus-service-monitor-export-port-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: host-port-conf
  namespace: monitor
data:
  host_port_conf.yaml: |
    # Prometheus monitor server port config.
    pw:
      env: "prod"
      applicationowner: "zhangsan"
      applicationname: "zhangsan"
      vendor: "export"
      techowner: "zhangsan"
      service: "pass"
      host:
        - "10.10.10.123"
        - "10.10.10.124"
      port:
        - 3389
      requirement: "check"  # report each probe as-is: up if reachable, down if not
      protocol: "tcp"
    
    crm:
      env: "prod"
      applicationowner: "wangwu"
      applicationname: "wangwu"
      vendor: "export"
      techowner: "wangwu"
      service: "good"
      host:
        - "10.10.10.128"
        - "10.10.10.129"
      port:
        - 1858
      requirement: "any"  # the whole service counts as up if at least one probe succeeds
      protocol: "tcp"
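# Deploy to the ACK cluster (apply the ConfigMap first, because the Deployment mounts it)
kubectl apply  -f  prometheus-service-monitor-export-port-configmap.yaml
kubectl apply  -f  prometheus-service-monitor-export-port-deploy.yaml
# Configure the custom scrape job in Prometheus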
- job_name: service-status-monitor-export-port
  scrape_interval: 30s
  scrape_timeout: 30s
  scheme: http
  metrics_path: /metrics
  static_configs:
  - targets: ['service-monitor-export-port.monitor.svc:80']
  
  
# PromQL for ports that are not up
sum by( host, port, sertype, env, applicationname,applicationowner,techowner,vendor,service,status ) (server_port_up{instance="service-monitor-export-port.monitor.svc:80"}) !=1
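
The PromQL expression above selects every server_port_up series whose value is not 1. As a rough offline equivalent, the sketch below applies the same filter to the text returned by the /metrics endpoint. It uses only the standard library; the function name find_down_ports and the sample lines are hypothetical, and the regular expressions cover only the simple label syntax this collector emits, not the full exposition format.

```python
import re
from typing import Dict, List

# metric_name{label="value",...} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def find_down_ports(metrics_text: str, metric: str = "server_port_up") -> List[Dict[str, str]]:
    """Return the label sets of all samples of `metric` whose value is not 1."""
    down = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None or match.group("name") != metric:
            continue
        if float(match.group("value")) != 1.0:
            down.append(dict(LABEL_PAIR.findall(match.group("labels"))))
    return down


# Hypothetical sample of what the collector might return
SAMPLE = """\
# HELP server_port_up Api response stats is:
# TYPE server_port_up gauge
server_port_up{sertype="pw",host="10.10.10.123",port="3389",env="prod"} 1.0
server_port_up{sertype="ack",host="10.10.10.128",port="1858",env="prod"} 0.0
"""

if __name__ == "__main__":
    for labels in find_down_ports(SAMPLE):
        print(labels)
```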

④ Process configuration

The query for Windows process (service) status is as follows:

windows_service_start_mode{instanceIp=~"10.10.10.76|10.10.10.252",start_mode="auto",name=~"mysqld|redis"}

⑤ Firewall port policy

If the Windows target http://10.10.10.22:9400/metrics shows as down, check whether Windows Firewall allows port 9400. Open cmd and run WF.msc to open Windows Defender Firewall.

Linux firewall configuration

# List the iptables rules with their line numbers
iptables -L --line -n


# Allow port 9400 in the INPUT and OUTPUT chains
iptables -I INPUT 1 -p tcp --dport 9400 -m comment --comment "Allow arms prometheus - 9400 TCP" -j ACCEPT
iptables -I OUTPUT 2 -p tcp --dport 9400 -m comment --comment "Allow arms prometheus - 9400 TCP" -j ACCEPT


# Save the firewall configuration
service iptables save
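
After the firewall rules are changed, reachability of the exporter port can be confirmed from the Prometheus side. The following is a minimal sketch using only the standard library; the function name tcp_port_open is hypothetical, and the host and port in the usage lines are taken from the example target above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection applies the timeout to the connect attempt
        # and the context manager closes the socket afterwards
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    target = ("10.10.10.22", 9400)
    state = "reachable" if tcp_port_open(*target) else "blocked or down"
    print(f"{target[0]}:{target[1]} is {state}")
```

A False result does not distinguish a firewall drop from a stopped exporter, so it is worth checking the exporter process on the host as well.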

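Traditional Process and Port Configuration

① Linux configuration

The script can be pushed to servers and run with Alibaba Cloud Cloud Assistant or Ansible.

# Create the script directory and the log directory
mkdir -p  /opt/prot-monitor
mkdir -p  /var/log/prot-monitor

Example: monitor the sshd process and port 22.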

cat port-monitor.sh
#!/bin/bash

# Services to monitor
declare -A MONITORS

# Format: MONITORS["service name"]="process1,process2:port1,port2"
MONITORS["ssh"]="sshd:"     # monitor a process
MONITORS["cyberark"]=":22"  # monitor a port

# Log file paths
LOG_DIR="/var/log/prot-monitor"
DATE=$(date +"%Y-%m-%d")
PORT_LOG="${LOG_DIR}/port-${DATE}.log"
PROCESS_LOG="${LOG_DIR}/process-${DATE}.log"

# Current time
TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")

# Host name and IP address
HOSTNAME=$(hostname)
IP_ADDRESS=$(hostname -I | awk '{print $1}')  # take the first IP address

# Remove logs older than 30 days
find "$LOG_DIR" -name "*.log" -type f -mtime +30 -exec rm -f {} \;

# Check whether a port is listening
check_port_status() {
    local port=$1
    local port_status="UP"

    # List the local listening ports and look for a whole-word match
    local port_list_output=$(netstat -tunlp | grep -v '@pts\|cpus\|master' | awk '{sub(/.*:/,"",$4);sub(/[0-9]*\//,"",$7);print $4}' | sort -n | uniq | grep -Ew "$port")

    if [[ -z "$port_list_output" ]]; then
        port_status="DOWN"
    fi
    echo "$port_status"
}

# Check whether a process is running
check_process_status() {
    local process=$1
    local process_status="UP"
    if ! ps -aux | grep -v grep | grep -q "$process"; then
        process_status="DOWN"
    fi
    echo "$process_status"
}

# Check every configured process and port
for SERVICE in "${!MONITORS[@]}"; do
    IFS=':' read -r PROCESSES PORTS <<< "${MONITORS[$SERVICE]}"

    # Process checks
    if [[ -n $PROCESSES ]]; then
        IFS=',' read -r -a PROCESS_ARRAY <<< "$PROCESSES"
        for PROCESS in "${PROCESS_ARRAY[@]}"; do
            PROCESS_STATUS=$(check_process_status "$PROCESS")
            echo "{\"timestamp\": \"$TIMESTAMP\", \"hostname\": \"$HOSTNAME\", \"ip_address\": \"$IP_ADDRESS\", \"service\": \"$SERVICE\", \"process\": \"$PROCESS\", \"process_status\": \"$PROCESS_STATUS\"}" >> "$PROCESS_LOG"
        done
    fi

    # Port checks
    if [[ -n $PORTS ]]; then
        IFS=',' read -r -a PORT_ARRAY <<< "$PORTS"
        for PORT in "${PORT_ARRAY[@]}"; do
            PORT_STATUS=$(check_port_status "$PORT")
            echo "{\"timestamp\": \"$TIMESTAMP\", \"hostname\": \"$HOSTNAME\", \"ip_address\": \"$IP_ADDRESS\", \"service\": \"$SERVICE\", \"port_status\": \"$PORT_STATUS\", \"port\": \"$PORT\"}" >> "$PORT_LOG"
        done
    fi
done

Scheduling the script with crontab

# Run port-monitor.sh every minute
crontab -l
*/1 * * * * /bin/bash /opt/prot-monitor/port-monitor.sh


# Run the script once by hand and check the log output
bash /opt/prot-monitor/port-monitor.sh


# Port log (JSON output)
cat  /var/log/prot-monitor/port-2024-11-18.log
{"timestamp": "2024-11-18 13:51:01", "hostname": "iZufXXXXXXXXXXXXXXX4Z", "ip_address": "10.10.10.80", "service": "sshd", "port_status": "UP", "port": "22"}


# Process log (JSON output)
cat  /var/log/prot-monitor/process-2024-11-18.log 
{"timestamp": "2024-11-18 13:50:01", "hostname": "iZufxxxxxxxxxxxxxx4Z", "ip_address": "10.10.10.80", "service": "sshd", "process": "sshd", "process_status": "UP"}


# Both the process log and the port log should report UP


# Start crond and check its status
systemctl start crond
systemctl status crond


# Check whether crond is enabled at boot
systemctl list-unit-files -t service  |  grep cron
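
Before the logs are shipped to SLS, the JSON lines written by port-monitor.sh can be checked locally. The sketch below is one possible way to pick out DOWN entries; the function name find_down_entries is hypothetical, and the field names are the ones shown in the log samples above.

```python
import json
from typing import Dict, Iterable, List


def find_down_entries(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return every log entry whose port_status or process_status is DOWN."""
    down = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(entry, dict):
            continue
        status = entry.get("port_status") or entry.get("process_status")
        if status == "DOWN":
            down.append(entry)
    return down


if __name__ == "__main__":
    # Hypothetical log lines in the format produced by the script
    sample = [
        '{"timestamp": "2024-11-18 13:51:01", "hostname": "host-a", "ip_address": "10.10.10.80", "service": "sshd", "port_status": "UP", "port": "22"}',
        '{"timestamp": "2024-11-18 13:52:01", "hostname": "host-a", "ip_address": "10.10.10.80", "service": "sshd", "process": "sshd", "process_status": "DOWN"}',
    ]
    for entry in find_down_entries(sample):
        print(entry["timestamp"], entry["service"], "DOWN")
```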

Alibaba Cloud SLS collects the logs from the log path on the server.

# Download the Logtail installer on the server
wget http://logtail-release-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/linux64/logtail.sh -O logtail.sh; chmod +x logtail.sh


# Install and start the Logtail service
./logtail.sh install auto
sudo /etc/init.d/ilogtaild start
sudo /etc/init.d/ilogtaild status 


# Check that the service is enabled at boot
systemctl list-unit-files -t service | grep   ilo

② Windows configuration

Example: monitor the processes PM.exe, CA.exe, and rdpclip.exe, and port 3389.

# Create the directory
md C:\monitor-logs
cat C:\monitor-logs\port-monitor.ps1
# Services to monitor
$MONITORS = @{
    "cyberark" = "PM.exe,CA.exe,rdpclip.exe:"  # monitor processes
    "rdp" = ":3389"                            # monitor a port
}

# Log file paths
$LOG_DIR = "C:\monitor-logs"
if (!(Test-Path $LOG_DIR)) {
    New-Item -Path $LOG_DIR -ItemType Directory | Out-Null
}

$DATE = Get-Date -Format "yyyy-MM-dd"
$PORT_LOG = Join-Path $LOG_DIR "port-$DATE.log"
$PROCESS_LOG = Join-Path $LOG_DIR "process-$DATE.log"

# Current time
$TIMESTAMP = Get-Date -Format "yyyy-MM-dd HH:mm:ss"

# Host name and IP address (first non-loopback IPv4 address, so the log field is a single string)
$HOSTNAME = $env:COMPUTERNAME
$IP_ADDRESS = (Get-NetIPAddress | Where-Object { $_.AddressFamily -eq 'IPv4' -and $_.InterfaceAlias -ne 'Loopback Pseudo-Interface 1' } | Select-Object -First 1).IPAddress

# Remove logs older than 30 days
Get-ChildItem -Path $LOG_DIR -Filter "*.log" | Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-30) } | Remove-Item

# Check whether a port is listening
function Check-PortStatus {
    param (
        [int]$port
    )
    $port_status = "UP"
    # Only count listening sockets, not established connections that happen to use the port
    if (-not (Get-NetTCPConnection -LocalPort $port -State Listen -ErrorAction SilentlyContinue)) {
        $port_status = "DOWN"
    }
    return $port_status
}

# Check process status with tasklist and findstr
function Check-ProcessStatus {
    param (
        [string]$processNameToCheck
    )

    Write-Host "Checking process: $processNameToCheck"
    $process_status = "UP"

    # Run tasklist and use findstr to look for the process
    $tasklist_command = "tasklist /FO CSV /NH | findstr /I `"$processNameToCheck`""
    $process_found = Invoke-Expression $tasklist_command

    if (-not $process_found) {
        Write-Host "Process $processNameToCheck not found."
        $process_status = "DOWN"
    } else {
        Write-Host "Process found: $processNameToCheck"
    }

    return $process_status
}

# Check every configured process and port
foreach ($SERVICE in $MONITORS.Keys) {
    $entry = $MONITORS[$SERVICE]
    $parts = $entry -split ':'
    $processes = $parts[0]
    $ports = $parts[1]

    # Process checks
    if ($processes) {
        $process_array = $processes -split ',' | Where-Object { $_ -ne "" }
        foreach ($process in $process_array) {
            $process_status = Check-ProcessStatus -processNameToCheck $process
            $log_entry = @{
                timestamp = $TIMESTAMP
                hostname = $HOSTNAME
                ip_address = $IP_ADDRESS
                service = $SERVICE
                process = $process
                process_status = $process_status
            }
            ($log_entry | ConvertTo-Json -Compress) | Out-File -Append -FilePath $PROCESS_LOG
        }
    }

    # Port checks
    if ($ports) {
        $port_array = $ports -split ',' | Where-Object { $_ -ne "" }
        foreach ($port in $port_array) {
            $port_status = Check-PortStatus -port $port
            $log_entry = @{
                timestamp = $TIMESTAMP
                hostname = $HOSTNAME
                ip_address = $IP_ADDRESS
                service = $SERVICE
                port_status = $port_status
                port = $port
            }
            ($log_entry | ConvertTo-Json -Compress) | Out-File -Append -FilePath $PORT_LOG
        }
    }
}
# Query specific processes by hand
tasklist /FO CSV /NH | findstr /I "PM.exe"
tasklist /FO CSV /NH | findstr /I "CA.exe"

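③ Scheduled task

Run taskschd.msc from cmd to open Task Scheduler.

The process and port monitoring script is scheduled to run every minute with the following action: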

powershell -WindowStyle Hidden -File "C:\monitor-logs\port-monitor.ps1"

④ SLS alert configuration for processes and ports

# Port query
*| select  timestamp,ip_address,hostname,service,port_status,port from log  where port_status is not null  and port  is not null    ORDER BY timestamp DESC  limit 1000


# Process query
*| select  timestamp,ip_address,hostname,service,process_status from log  where process_status is not null      ORDER BY timestamp DESC  limit 1000

⑤ Grafana dashboard

Intranet Domain Status Check

① Dockerfile configuration

FROM python
# Set the container time zone to Asia/Shanghai
ENV TZ=Asia/Shanghai
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# Install the script's dependencies
RUN pip install pyyaml  --upgrade -i  https://pypi.tuna.tsinghua.edu.cn/simple
RUN pip install requests --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
COPY domain_return_code.py   /opt/
CMD ["sleep","999"]

② Check script and debugging

#!/usr/bin/env python
# Fetch each site's HTTP status code, response time, and certificate expiry date

import datetime
import json
import socket
import ssl
from urllib.parse import urlparse

import requests
import yaml


def get_cert_expiration_date(url, port):
    try:
        # Take the host name from the URL; this works for both http:// and https://
        host = urlparse(url).hostname
        context = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=host) as sslsock:
                cert = sslsock.getpeercert()
                expiration_date = ssl.cert_time_to_seconds(cert['notAfter'])
                return expiration_date
    except Exception:
        # If the certificate cannot be read or verified, return a 2008-08-08 placeholder
        return 1218168000


def get_status_code(url, timeout):
    try:
        requests.packages.urllib3.disable_warnings(requests.packages.urllib3.exceptions.InsecureRequestWarning)
        response = requests.get(url, timeout=timeout, verify=False)
        response_time = response.elapsed.total_seconds()
        # Get the SSL certificate expiry time
        cert_expiration_date = get_cert_expiration_date(url, 443)
        # Convert the timestamp to YYYY-MM-DD
        dateArray = datetime.datetime.fromtimestamp(cert_expiration_date, datetime.timezone.utc)
        cert_expiration_date_format = dateArray.strftime("%Y-%m-%d")
        return response.status_code, response_time, cert_expiration_date_format
    except requests.exceptions.RequestException:
        return None


def get_domain_returncode(url, timeout):
    status_code = get_status_code(url, timeout=timeout)
    if status_code is not None:
        return status_code[0], status_code[1], status_code[2]
    else:
        # The request failed: report 504, a 100-second response time, and the placeholder date
        return 504, 100, '2008-08-08'


def read_yaml(file_path):
    with open(file_path, 'r') as file:
        try:
            yaml_data = yaml.safe_load(file)
            return yaml_data
        except yaml.YAMLError as e:
            print("Error while reading the YAML file:", e)
            return None


if __name__ == "__main__":
    file_path = "/opt/domain_returncode_conf.yaml"
    reade_config = read_yaml(file_path)
    config = reade_config["domain"]
    for i in config:
        code = get_domain_returncode(i["Name"], timeout=i["timeout"])
        # "response_code" holds the response time in seconds; the key name is kept as-is
        domain_status_dict = {
            "Name": i["Name"],
            "return_code": code[0],
            "response_code": code[1],
            "EndDate": code[2],
        }
        # Copy the remaining config fields (Network, environment, owners, ...) into the record
        domain_status_dict.update(i)
        print(json.dumps(domain_status_dict))

cat  /opt/domain_returncode_conf.yaml
domain:
  - Name: https://admin.export.cn
    timeout: 15
    Network: intranet
  - Name: https://api.export.cn
    timeout: 15
    Network: intranet
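③ CronJob configuration

cat aliyun_domain_code.yaml
# batch/v1 is available from Kubernetes 1.21; batch/v1beta1 was removed in 1.25
apiVersion: batch/v1
kind: CronJob
metadata:
  name: aliyun-domaincode-monitor-server
  # Must be the same namespace as the domain-returncode ConfigMap that the job mounts
  namespace: monitoring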
spec:
  schedule: "*/1  * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      parallelism: 1
      completions: 1
      backoffLimit: 3
      activeDeadlineSeconds: 60
      ttlSecondsAfterFinished: 600
      template:
        spec:
          volumes:
          - name: domain-returncode-config
            configMap:
              name: domain-returncode
          containers:
          - name: aliyun-domaincode-monitor
            image: harbor.export.cn/ops/domain-return-code:v1
            #imagePullSecrets:
            imagePullPolicy: Always
            command:
            - /bin/sh
            - -c
            - python3 /opt/domain_return_code.py
            volumeMounts:
            - name: domain-returncode-config
              mountPath: /opt/domain_returncode_conf.yaml
              subPath: domain_returncode_conf.yaml
          restartPolicy: OnFailure
  startingDeadlineSeconds: 300


---
apiVersion: v1
kind: ConfigMap
metadata:
  name: domain-returncode
  namespace: monitoring
data:
  domain_returncode_conf.yaml: |
    domain:
      - Name: https://admin.export.cn
        timeout: 15
        Network: internet+intranet
        environment: prod
        vendor: zhangsan
        application_name: admin
        application_owner: zhangsan
        tech_owner: zhangsan
        Assignment_group: admin Operations


      - Name: https://api.test.cn
        timeout: 15
        Network: internet
        environment: prod
        vendor: wangwu
        application_name: api
        application_owner: test
        tech_owner: test
        Assignment_group: api Support

④ Log output and SLS SQL queries

# Sample log output
{"Name": "https://admin.export.cn", "return_code": 200, "response_code": 5.087571, "EndDate": "2025-01-21", "timeout": 15, "Network": "internet+intranet", "environment": "prod", "vendor": "zhangsan", "application_name": "admin", "application_owner": "zhangsan", "tech_owner": "zhangsan", "Assignment_group": "admin Operations"}

SLS queries used for alerting:

# Domains whose certificate expires within the next 60 days
(EndDate: * and _namespace_ : monitoring and _container_name_: aliyun-domaincode-monitor)| select DISTINCT Name,return_code,EndDate,date_diff('day', date_parse(split(_time_, 'T')[1], '%Y-%m-%d'), date_parse(EndDate, '%Y-%m-%d')) as days having days > 0 AND days <= 60 ORDER BY days ASC LIMIT 1000


# Domains returning 5XX status codes
(((return_code : 5?? ))) not name:"https://view.export.cn" | SELECT DISTINCT Name,return_code,content from log LIMIT 10000
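
The certificate query above keeps records whose EndDate lies between 1 and 60 days ahead and sorts them by days remaining. The same arithmetic can be sketched in Python for a quick local check; the function names days_until_expiry and expiring_soon are hypothetical, and the records are assumed to carry the Name and EndDate fields shown in the sample log.

```python
import datetime
from typing import Dict, Iterable, List, Tuple


def days_until_expiry(end_date: str, today: datetime.date) -> int:
    """Number of whole days from `today` to an EndDate in YYYY-MM-DD format."""
    expiry = datetime.datetime.strptime(end_date, "%Y-%m-%d").date()
    return (expiry - today).days


def expiring_soon(
    records: Iterable[Dict[str, str]],
    today: datetime.date,
    window_days: int = 60,
) -> List[Tuple[str, int]]:
    """Return (Name, days) for certificates expiring in 1..window_days days, soonest first."""
    result = []
    for record in records:
        days = days_until_expiry(record["EndDate"], today)
        if 0 < days <= window_days:
            result.append((record["Name"], days))
    return sorted(result, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical records in the format printed by the check script
    records = [
        {"Name": "https://admin.export.cn", "EndDate": "2025-01-21"},
        {"Name": "https://api.test.cn", "EndDate": "2008-08-08"},
    ]
    print(expiring_soon(records, today=datetime.date(2024, 12, 1)))
```

Records carrying the 2008-08-08 placeholder date produce a negative day count and are excluded, which matches the `days > 0` condition in the query.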

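In summary, Company A used Prometheus to monitor ports, processes, and intranet domain status on Windows and Linux. The workflow covered tool selection and deployment, port and process monitoring, alerting, and data visualization and tuning, which helps keep the IT infrastructure stable and secure. With Grafana dashboards and alert rules in place, the monitoring and development staff concerned are notified promptly, improving system availability and response time. Together, these measures let Company A manage and maintain its business-critical services and support continued business growth.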


Editor: 武曉燕  Source: 新鈦云服