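Raspberry Pi and Orange Pi CPU Temperature Monitoring in Practice: From the Command Line to a Visual Alerting System
Among single-board computers, the Raspberry Pi and Orange Pi have become go-to tools for makers, developers, and hobbyists thanks to their low cost and expandability. Whether they serve as a home media center, an automation control node, or a lightweight server, these small boards are often expected to run stably for long stretches. CPU temperature is a key indicator of system health and directly affects both performance and lifespan. This article starts from basic commands and builds up a complete temperature monitoring setup, ending with a desktop widget and automated alerts.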
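1. Basic Temperature Monitoring: Command Line and Scripts
1.1 Native System Commands
Although the Raspberry Pi and Orange Pi are built on different hardware, both expose temperature through the standard Linux thermal interface. The most direct approach is to read the temperature file the kernel provides: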
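    cat /sys/class/thermal/thermal_zone0/temp

This command returns an integer in millidegrees Celsius. For example, an output of 51234 means the current temperature is 51.234°C. For a friendlier display, format the value with awk: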
    awk '{printf "CPU temperature: %.2f°C\n", $1/1000}' /sys/class/thermal/thermal_zone0/temp

Raspberry Pi users can also use the Pi-specific vcgencmd tool, which reads the temperature through the board's firmware:
    vcgencmd measure_temp

If the command is not found, install the package that provides it first:
    sudo apt install libraspberrypi-bin

1.2 Writing a Real-Time Monitoring Script
For a display that refreshes automatically, combine the command with watch:
    watch -n 3 'echo "CPU temperature: $(vcgencmd measure_temp | cut -c6-11)"'

This refreshes the reading every 3 seconds. For more involved monitoring, write a custom script:
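    #!/bin/bash
    # Refresh the CPU temperature and 1-minute load average every 2 seconds.
    while true; do
        clear
        temp=$(cat /sys/class/thermal/thermal_zone0/temp)   # millidegrees Celsius
        load=$(awk '{print $1}' /proc/loadavg)              # 1-minute load average
        echo "CPU temperature: $(echo "scale=2; $temp/1000" | bc)°C"
        echo "System load: $load"
        sleep 2
    done

Besides the temperature, this extended script also shows the system load, which gives more context when analyzing performance.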
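The same two readings can also be taken from Python, which is the language the alerting scripts later in this article use. The following is a minimal sketch rather than a definitive implementation: the function names (millidegrees_to_celsius, parse_load_1min, format_status) are hypothetical and chosen only for illustration, and it assumes the CPU sensor is thermal_zone0, as in the commands above.

```python
#!/usr/bin/env python3
"""Sketch: read CPU temperature and load average from sysfs and procfs."""
from pathlib import Path

# Paths used in the article; other boards may expose the CPU sensor
# under a different thermal zone (assumption: thermal_zone0 is the CPU).
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")
LOADAVG_PATH = Path("/proc/loadavg")


def millidegrees_to_celsius(raw: str) -> float:
    """Convert the kernel's millidegree string (e.g. '51234') to degrees Celsius."""
    return int(raw.strip()) / 1000


def parse_load_1min(line: str) -> float:
    """Return the first field of /proc/loadavg, the 1-minute load average."""
    return float(line.split()[0])


def format_status(temp_raw: str, loadavg_line: str) -> str:
    """Build the two-line status text from the raw file contents."""
    temp_c = millidegrees_to_celsius(temp_raw)
    load = parse_load_1min(loadavg_line)
    return f"CPU temperature: {temp_c:.2f}°C\nSystem load: {load:.2f}"


if __name__ == "__main__":
    # Only touches the real files when run directly on the board.
    print(format_status(TEMP_PATH.read_text(), LOADAVG_PATH.read_text()))
```

Keeping the parsing in small functions that take strings means the conversion can be checked on any machine, without a thermal sensor present.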
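2. Advanced Monitoring: Background Services and Automation
2.1 Creating a System Daemon
To keep the monitoring script running in the background and start it automatically at boot, turn it into a systemd service. Create a new service file: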
    sudo nano /etc/systemd/system/cpu-monitor.service

Add the following content:
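    [Unit]
    Description=CPU Temperature Monitor
    After=network.target

    [Service]
    ExecStart=/usr/local/bin/cpu-monitor.sh
    Restart=always
    # Assumes a user named "pi" exists; change this to your own account.
    User=pi

    [Install]
    WantedBy=multi-user.target

Then save the monitoring script to /usr/local/bin/cpu-monitor.sh. A service runs without a terminal, so drop the clear call from the earlier script (or use the logging script from section 2.2); anything the script prints ends up in the journal, viewable with journalctl -u cpu-monitor.service. Make the script executable: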
    sudo chmod +x /usr/local/bin/cpu-monitor.sh

Enable and start the service:
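    sudo systemctl enable cpu-monitor.service
    sudo systemctl start cpu-monitor.service

2.2 Temperature Logging and Historical Analysis
Long-term monitoring needs historical data so that trends can be analyzed. The following script appends temperature readings to a CSV-formatted file: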
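    #!/bin/bash
    # Append "timestamp,temperature" rows every 5 minutes.
    # Writing under /var/log normally requires root privileges.
    LOG_FILE="/var/log/cpu_temp.log"
    while true; do
        timestamp=$(date "+%Y-%m-%d %H:%M:%S")
        temp=$(awk '{print $1/1000}' /sys/class/thermal/thermal_zone0/temp)
        echo "$timestamp,$temp" >> "$LOG_FILE"
        sleep 300
    done

This script records the temperature every 5 minutes. A tool such as gnuplot can then turn the log into a chart: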
    #!/usr/bin/gnuplot -persist
    set datafile separator ","
    set xdata time
    set timefmt "%Y-%m-%d %H:%M:%S"
    set format x "%m/%d\n%H:%M"
    set ylabel "Temperature (°C)"
    set title "CPU Temperature History"
    plot "/var/log/cpu_temp.log" using 1:2 with lines title "CPU Temp"

3. Visualization: Configuring Desktop Widgets
3.1 Conky Real-Time Monitoring Panel
Conky is a lightweight system monitor that can draw highly customizable desktop widgets. Install Conky:
    sudo apt install conky

Create the configuration file ~/.conkyrc:
    conky.config = {
        background = true,
        update_interval = 2,
        cpu_avg_samples = 2,
        net_avg_samples = 2,
        double_buffer = true,
        no_buffers = true,
        text_buffer_size = 2048,
        gap_x = 20,
        gap_y = 40,
        alignment = 'top_right',
        own_window = true,
        own_window_type = 'normal',
        own_window_transparent = true,
        own_window_hints = 'undecorated,below,sticky,skip_taskbar,skip_pager',
        border_inner_margin = 5,
        border_outer_margin = 0,
        draw_shades = false,
        draw_outline = false,
        draw_borders = false,
        use_xft = true,
        font = 'DejaVu Sans:size=10',
        xftalpha = 0.8,
        uppercase = false,
    };

    conky.text = [[
    ${color white}${font DejaVu Sans:bold:size=12}SYSTEM MONITOR${font}${color}
    ${hr 2}
    ${color lightgrey}CPU Temp: ${alignr}${execpi 5 vcgencmd measure_temp | cut -c6-9}°C
    ${color lightgrey}CPU Usage: ${alignr}${cpu}%
    ${color lightgrey}RAM Usage: ${alignr}${memperc}%
    ${color lightgrey}Disk Usage: ${alignr}${fs_used_perc /}%
    ${color lightgrey}Uptime: ${alignr}${uptime_short}
    ]]

The cut range stops before the 'C suffix that vcgencmd prints, because the template already appends °C. Start Conky:
    conky -d

3.2 GNOME Shell Extension
Users on the GNOME desktop can install gnome-shell-extension-system-monitor:
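    sudo apt install gnome-shell-extension-system-monitor

Then enable the "System Monitor" extension in GNOME Tweaks. To show the CPU temperature, edit the extension's settings. The key name may differ between extension versions, so check it against the keys your installed version actually uses before writing it: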
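    dconf write /org/gnome/shell/extensions/system-monitor/cpu-temperature true

4. Implementing a Smart Alert System
4.1 Temperature Threshold Detection
Create a Python script that checks whether the temperature exceeds a threshold: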
    #!/usr/bin/env python3
    import smtplib
    from email.mime.text import MIMEText

    THRESHOLD = 70  # temperature threshold (°C)
    LOG_FILE = "/var/log/cpu_temp_alert.log"  # writing here normally requires root


    def get_cpu_temp():
        """Return the CPU temperature in °C."""
        with open("/sys/class/thermal/thermal_zone0/temp", "r") as f:
            return int(f.read().strip()) / 1000


    def log_alert(temp):
        """Append an alert line to the log file."""
        with open(LOG_FILE, "a") as f:
            f.write(f"Alert! CPU temperature reached {temp}°C\n")


    def send_email(temp):
        """Send an alert email; addresses, host, and credentials are placeholders."""
        msg = MIMEText(f"Warning: CPU temperature has reached {temp}°C")
        msg["Subject"] = "CPU temperature alert"
        msg["From"] = "alert@example.com"
        msg["To"] = "admin@example.com"
        with smtplib.SMTP("smtp.example.com", 587) as server:
            server.starttls()  # port 587 expects STARTTLS before login
            server.login("user", "password")
            server.send_message(msg)


    if __name__ == "__main__":
        temp = get_cpu_temp()
        if temp > THRESHOLD:
            log_alert(temp)
            send_email(temp)

4.2 System Notification Integration
Desktop users can raise a visual notification with the notify-send command:
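    #!/bin/bash
    THRESHOLD=70
    TEMP=$(awk '{print $1/1000}' /sys/class/thermal/thermal_zone0/temp)
    # bc prints 1 when the comparison is true, 0 otherwise
    if (( $(echo "$TEMP > $THRESHOLD" | bc -l) )); then
        notify-send -u critical "Temperature alert" "CPU temperature has reached ${TEMP}°C"
    fi

Cron jobs do not inherit the desktop session's environment, so the script may need DISPLAY and DBUS_SESSION_BUS_ADDRESS set before notify-send can reach the desktop. Add the script to cron to run the check once a minute: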
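    (crontab -l ; echo "* * * * * /path/to/temp_check.sh") | crontab -

4.3 Automated Cooling Measures
When the temperature stays too high, the system can automatically take action to ease the load on the CPU: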
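    #!/usr/bin/env python3
    # Run as root: writing scaling_governor requires root privileges.
    import subprocess
    import time
    from datetime import datetime

    MAX_TEMP = 75        # start throttling above this temperature (°C)
    COOLDOWN_TEMP = 65   # stop throttling below this temperature (°C)
    CHECK_INTERVAL = 30  # seconds
    # Assumption: "ondemand" is the governor to restore; adjust for your system.
    NORMAL_GOVERNOR = "ondemand"
    GOVERNOR_GLOB = "/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"


    def get_temp():
        with open("/sys/class/thermal/thermal_zone0/temp", "r") as f:
            return int(f.read().strip()) / 1000


    def throttle_cpu(enable):
        """Switch to powersave to cut heat, or back to the normal governor."""
        governor = "powersave" if enable else NORMAL_GOVERNOR
        subprocess.run(f"echo {governor} | tee {GOVERNOR_GLOB} > /dev/null", shell=True)


    throttled = False
    while True:
        temp = get_temp()
        now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        if temp > MAX_TEMP and not throttled:
            print(f"{now} - Temperature too high: {temp}°C, enabling throttling")
            throttle_cpu(True)
            throttled = True
        elif temp < COOLDOWN_TEMP and throttled:
            print(f"{now} - Temperature normal: {temp}°C, disabling throttling")
            throttle_cpu(False)
            throttled = False
        time.sleep(CHECK_INTERVAL)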
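The gap between MAX_TEMP and COOLDOWN_TEMP in the script above acts as a hysteresis band: between the two thresholds the previous state is kept, so the governor does not flip back and forth when the temperature hovers around a single value. The sketch below isolates that decision as a pure function so it can be checked without touching sysfs. The names next_throttle_state and simulate are hypothetical helpers introduced here for illustration, and the sample readings are made up to exercise each branch.

```python
"""Sketch: the two-threshold (hysteresis) decision as a pure function."""

MAX_TEMP = 75       # °C, same value as in the script above
COOLDOWN_TEMP = 65  # °C, same value as in the script above


def next_throttle_state(throttled: bool, temp_c: float) -> bool:
    """Return whether throttling should be active after seeing temp_c.

    Throttling turns on only above MAX_TEMP and turns off only below
    COOLDOWN_TEMP; in between, the previous state is kept.
    """
    if temp_c > MAX_TEMP:
        return True
    if temp_c < COOLDOWN_TEMP:
        return False
    return throttled


def simulate(readings):
    """Replay a list of readings and return the state after each one."""
    state = False
    states = []
    for temp_c in readings:
        state = next_throttle_state(state, temp_c)
        states.append(state)
    return states


if __name__ == "__main__":
    # Hypothetical readings, chosen only to exercise each branch.
    samples = [60, 76, 70, 66, 64, 70]
    for temp_c, state in zip(samples, simulate(samples)):
        print(f"{temp_c}°C -> throttled={state}")
```

In this run the 70°C reading appears twice with different results: it keeps throttling on while the board is cooling down from 76°C, and keeps it off once the temperature has dropped below 65°C.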