Initial commit: Telegram Management System

Full-stack web application for Telegram management
- Frontend: Vue 3 + Vben Admin
- Backend: NestJS
- Features: User management, group broadcast, statistics

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
你的用户名
2025-11-04 15:37:50 +08:00
commit 237c7802e5
3674 changed files with 525172 additions and 0 deletions

OPERATIONS.md Normal file

@@ -0,0 +1,940 @@
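# Telegram Management System - Operations Manual
This manual gives detailed guidance for day-to-day operation of the Telegram Management System, including routine tasks, troubleshooting, performance tuning, and security management.
## Table of Contents
- [Routine Operations](#routine-operations)
- [System Monitoring](#system-monitoring)
- [Fault Diagnosis and Handling](#fault-diagnosis-and-handling)
- [Performance Tuning](#performance-tuning)
- [Security Management](#security-management)
- [Backup and Recovery](#backup-and-recovery)
- [Version Updates](#version-updates)
- [Emergency Response](#emergency-response)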
## Routine Operations
### Service Status Checks
**Check the application service**:
```bash
# PM2 service status
pm2 status
pm2 monit
# Check processes
ps aux | grep node
ps aux | grep telegram-management
# Check listening ports
netstat -tlnp | grep :3000
ss -tlnp | grep :3000
# Check service responses
curl -I http://localhost:3000/health
curl -s http://localhost:3000/health/detailed | jq .
```
**Check the database**:
```bash
# MySQL service status
sudo systemctl status mysql
mysqladmin -u root -p status
mysqladmin -u root -p processlist
# Connection counts
mysql -u root -p -e "SHOW STATUS LIKE 'Threads_connected';"
mysql -u root -p -e "SHOW STATUS LIKE 'Max_used_connections';"
# Slow query count
mysql -u root -p -e "SHOW STATUS LIKE 'Slow_queries';"
```
**Check Redis**:
```bash
# Redis service status
sudo systemctl status redis
redis-cli ping
# Redis information
redis-cli info server
redis-cli info memory
redis-cli info stats
# Connection counts
redis-cli info clients
```
### Log Management
**View application logs**:
```bash
# PM2 logs
pm2 logs telegram-management-backend
pm2 logs telegram-management-backend --lines 100
# Application log files
tail -f backend/logs/app.log
tail -f backend/logs/error.log
tail -f backend/logs/access.log
# Filter for errors
grep -i error backend/logs/app.log
grep -i "500\|error\|exception" backend/logs/access.log
```
**View system logs**:
```bash
# System journal
sudo journalctl -u telegram-management-backend -f
sudo journalctl -u mysql -f
sudo journalctl -u redis -f
# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```
**Manage log rotation**:
```bash
# Force a rotation now
sudo logrotate -f /etc/logrotate.d/telegram-management
# Dry run: show what logrotate would do without changing anything
sudo logrotate -d /etc/logrotate.d/telegram-management
# Delete rotated logs older than 30 days
find backend/logs -name "*.log.*" -mtime +30 -delete
```
### Disk Space Management
**Check disk usage**:
```bash
# Disk usage overview
df -h
du -sh /var/www/telegram-management/*
# Find large files
find /var/www/telegram-management -type f -size +100M -exec ls -lh {} \;
# Directory sizes, one level deep
du -h --max-depth=1 /var/www/telegram-management/
```
**Clean up temporary files**:
```bash
# Application temporary files
rm -rf backend/tmp/*
rm -rf backend/sessions/tmp_*
# System temporary files
sudo rm -rf /tmp/telegram-*
sudo rm -rf /var/tmp/telegram-*
# npm cache
npm cache clean --force
```
### Database Maintenance
**Routine maintenance**:
```bash
# Optimize tables
mysql -u root -p -e "OPTIMIZE TABLE telegram_management.group_tasks;"
mysql -u root -p -e "OPTIMIZE TABLE telegram_management.tg_account_pool;"
mysql -u root -p -e "OPTIMIZE TABLE telegram_management.risk_logs;"
# Refresh table statistics
mysql -u root -p -e "ANALYZE TABLE telegram_management.group_tasks;"
# Check table integrity
mysql -u root -p -e "CHECK TABLE telegram_management.group_tasks;"
# Repair a table if needed (MyISAM only; REPAIR TABLE is not supported for InnoDB tables)
mysql -u root -p -e "REPAIR TABLE telegram_management.group_tasks;"
```
**Purge historical data**:
```sql
-- Delete risk-control logs older than 30 days
DELETE FROM risk_logs WHERE createdAt < DATE_SUB(NOW(), INTERVAL 30 DAY);
-- Delete anomaly logs older than 90 days
DELETE FROM anomaly_logs WHERE createdAt < DATE_SUB(NOW(), INTERVAL 90 DAY);
-- Delete completed task records, keeping the last 6 months
DELETE FROM group_tasks
WHERE status = 'completed'
  AND completedAt < DATE_SUB(NOW(), INTERVAL 6 MONTH);
-- Reclaim table space
OPTIMIZE TABLE risk_logs, anomaly_logs, group_tasks;
```
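On large tables a single `DELETE` can hold locks for a long time. One way to soften that is to delete in bounded batches. The sketch below is an illustration, not part of the project: the helper names `purge_sql` and `purge_in_batches`, the default batch size of 5000, and the assumption that `mysql` picks up credentials from `~/.my.cnf` are all hypothetical.

```bash
# purge_sql TABLE DATE_COLUMN DAYS BATCH: prints one bounded DELETE statement.
purge_sql() {
    printf 'DELETE FROM %s WHERE %s < DATE_SUB(NOW(), INTERVAL %d DAY) LIMIT %d;\n' \
        "$1" "$2" "$3" "$4"
}

# purge_in_batches TABLE DATE_COLUMN DAYS [BATCH]: repeats the bounded DELETE
# until a batch removes no rows. Assumes credentials come from ~/.my.cnf.
purge_in_batches() {
    local table="$1" column="$2" days="$3" batch="${4:-5000}" deleted
    while :; do
        deleted=$(mysql -N -B telegram_management \
            -e "$(purge_sql "$table" "$column" "$days" "$batch") SELECT ROW_COUNT();") || return 1
        [ "$deleted" -gt 0 ] || break
        sleep 1   # short pause so other queries can run between batches
    done
}
```

For example, `purge_in_batches risk_logs createdAt 30` would apply the 30-day retention above in batches of 5000 rows.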
## System Monitoring
### Key Metrics
**System resource monitoring script** (`monitor.sh`):
```bash
#!/bin/bash
LOG_FILE="/var/log/telegram-management-monitor.log"
ALERT_THRESHOLD_CPU=80
ALERT_THRESHOLD_MEM=85
ALERT_THRESHOLD_DISK=90
# Collect system metrics (CPU_USAGE is the user-space percentage reported by top)
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')
MEM_USAGE=$(free | grep Mem | awk '{printf("%.2f", ($3/$2) * 100.0)}')
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
# Record the metrics
echo "$(date '+%Y-%m-%d %H:%M:%S') - CPU: ${CPU_USAGE}%, MEM: ${MEM_USAGE}%, DISK: ${DISK_USAGE}%" >> "$LOG_FILE"
# Check alert conditions (bc handles the decimal comparisons)
if (( $(echo "$CPU_USAGE > $ALERT_THRESHOLD_CPU" | bc -l) )); then
    echo "ALERT: High CPU usage: ${CPU_USAGE}%" | logger -t telegram-management
fi
if (( $(echo "$MEM_USAGE > $ALERT_THRESHOLD_MEM" | bc -l) )); then
    echo "ALERT: High memory usage: ${MEM_USAGE}%" | logger -t telegram-management
fi
if [ "$DISK_USAGE" -gt "$ALERT_THRESHOLD_DISK" ]; then
    echo "ALERT: High disk usage: ${DISK_USAGE}%" | logger -t telegram-management
fi
```
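The threshold checks in `monitor.sh` depend on `bc`, which is not installed on every minimal server image. As a sketch of an alternative, the comparison can be done with `awk`, which also handles decimal values. The helper names `exceeds` and `check_metric` are hypothetical and not part of the project.

```bash
# exceeds VALUE THRESHOLD: succeeds (exit status 0) when VALUE > THRESHOLD.
# awk compares the two numerically, so decimals such as 85.25 work.
exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

# check_metric NAME VALUE THRESHOLD: prints an alert line when VALUE is above
# THRESHOLD and prints nothing otherwise.
check_metric() {
    local name="$1" value="$2" threshold="$3"
    if exceeds "$value" "$threshold"; then
        echo "ALERT: High ${name} usage: ${value}%"
    fi
}
```

In `monitor.sh` the output could be piped to `logger`, for example `check_metric CPU "$CPU_USAGE" "$ALERT_THRESHOLD_CPU" | logger -t telegram-management`, with the caveat that `logger` would then also be invoked when there is nothing to report.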
**Application performance monitoring**:
```bash
# HTTP response time
curl -o /dev/null -s -w "Response time: %{time_total}s\n" http://localhost:3000/health
# Active database connections
mysql -u tg_user -p -e "SELECT COUNT(*) as active_connections FROM information_schema.processlist;"
# Redis latency
redis-cli --latency-history -i 1
# PM2 process details
pm2 show telegram-management-backend
```
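### Automated Monitoring Scripts
**Health check script** (`health-check.sh`):
```bash
#!/bin/bash
SERVICE_NAME="telegram-management-backend"
HEALTH_URL="http://localhost:3000/health"
EMAIL_ALERT="admin@yourdomain.com"
# DB_PASSWORD must be provided by the environment (cron does not load your shell profile)
# Check the PM2 process
if ! pm2 list | grep -q "$SERVICE_NAME.*online"; then
    echo "Service $SERVICE_NAME is not running, attempting restart..."
    pm2 restart "$SERVICE_NAME"
    # Wait for the service to start
    sleep 10
    # Check again
    if ! pm2 list | grep -q "$SERVICE_NAME.*online"; then
        echo "Service restart failed, sending alert email"
        echo "Service $SERVICE_NAME failed to restart, check immediately" | mail -s "URGENT: service down" "$EMAIL_ALERT"
    fi
fi
# Check the HTTP response
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "$HEALTH_URL")
if [ "$HTTP_CODE" != "200" ]; then
    echo "Health check failed, HTTP status code: $HTTP_CODE"
    echo "Health check failed, HTTP status code: $HTTP_CODE" | mail -s "ALERT: health check failed" "$EMAIL_ALERT"
fi
# Check the database connection
if ! mysql -u tg_user -p"$DB_PASSWORD" -e "SELECT 1;" &> /dev/null; then
    echo "Database connection failed"
    echo "Database connection failed, check the database service" | mail -s "ALERT: database connection failed" "$EMAIL_ALERT"
fi
# Check the Redis connection
if ! redis-cli ping &> /dev/null; then
    echo "Redis connection failed"
    echo "Redis connection failed, check the Redis service" | mail -s "ALERT: Redis connection failed" "$EMAIL_ALERT"
fi
```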
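A single failed HTTP probe triggers an alert email in the script above, so a brief hiccup can page the team. A small retry wrapper is one way to reduce false alarms. The sketch below is an illustration only: `retry` and `http_ok` are hypothetical helper names, and the attempt count, delay, and 5-second timeout are assumed values to tune.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND...: runs COMMAND until it succeeds or
# ATTEMPTS runs have failed. Returns 0 on the first success, 1 otherwise.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only between attempts, not after the last one
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# http_ok URL: succeeds only when the endpoint answers with HTTP 200.
http_ok() {
    [ "$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")" = "200" ]
}
```

With these helpers the HTTP check could become `retry 3 5 http_ok "$HEALTH_URL" || echo "Health check failed" | mail -s "ALERT: health check failed" "$EMAIL_ALERT"`, which alerts only after three consecutive failures.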
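**Scheduled jobs**:
```bash
# Edit the crontab
crontab -e
# Add the following entries:
# Check system resources every minute
* * * * * /path/to/monitor.sh
# Run the health check every 5 minutes
*/5 * * * * /path/to/health-check.sh
# Back up important data every hour
0 * * * * /path/to/backup.sh
# Clean up logs every day at 02:00
0 2 * * * /path/to/cleanup-logs.sh
```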
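Cron starts a job on schedule even if the previous run has not finished, so an hourly backup that takes longer than an hour would overlap with the next one. The sketch below guards a command with a lock directory. `run_locked` is a hypothetical helper, and the exit code 75 for "already running" is an arbitrary choice.

```bash
# run_locked LOCK_DIR COMMAND...: runs COMMAND only if LOCK_DIR can be created.
# mkdir is atomic, so two concurrent invocations cannot both take the lock.
# Returns 75 when the lock is already held, otherwise COMMAND's exit status.
run_locked() {
    local lock_dir="$1" status
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "skipped: $lock_dir is held by another run" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"   # release the lock whether COMMAND succeeded or not
    return "$status"
}
```

A wrapper script called from cron could then run `run_locked /tmp/telegram-backup.lock /path/to/backup.sh`. If a run is killed before it reaches `rmdir`, the lock directory stays behind and has to be removed by hand.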
## Fault Diagnosis and Handling
### Diagnosing Common Faults
**Service not responding**:
```bash
# 1. Check process status
pm2 status
ps aux | grep node
# 2. Check what is using the port
netstat -tlnp | grep :3000
lsof -i :3000
# 3. Check system resources
top
free -h
df -h
# 4. Review error logs
pm2 logs telegram-management-backend --err
tail -f backend/logs/error.log
# 5. Restart the service
pm2 restart telegram-management-backend
```
**Database connection problems**:
```bash
# 1. Check the MySQL service
sudo systemctl status mysql
sudo systemctl restart mysql
# 2. Check connection counts against the limit
mysql -u root -p -e "SHOW STATUS LIKE 'Threads_connected';"
mysql -u root -p -e "SHOW VARIABLES LIKE 'max_connections';"
# 3. Check for deadlocks
mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G" | grep -A 20 "LATEST DETECTED DEADLOCK"
# 4. Check slow queries
mysql -u root -p -e "SHOW STATUS LIKE 'Slow_queries';"
tail -f /var/log/mysql/slow.log
```
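**Diagnosing memory leaks**:
```bash
# 1. Generate a heap snapshot
# (works only if Node was started with --heapsnapshot-signal=SIGUSR2)
kill -USR2 $(pgrep -f "telegram-management-backend")
# 2. Analyze memory usage (this starts a separate instance with the inspector enabled)
node --inspect backend/src/app.js
# Connect with Chrome DevTools and analyze
# 3. Watch memory growth
while true; do
    ps -p $(pgrep -f "telegram-management-backend") -o pid,vsz,rss,comm
    sleep 60
done
# 4. Restart the service to release memory
pm2 restart telegram-management-backend
```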
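### Fault Handling Process
**Severity levels**:
- **P0 (Critical)**: service completely unavailable
- **P1 (Major)**: core functionality broken
- **P2 (Moderate)**: some functionality broken
- **P3 (Minor)**: performance problems or warnings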
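
Incident records are easier to search when every entry carries one of these four levels. As a sketch, a small helper can validate the level before appending to the incident log. `log_incident` is a hypothetical name, and the default path `/var/log/incidents.log` is an assumption chosen to match the log file used in the P0 procedure.

```bash
# log_incident LEVEL MESSAGE [LOG_FILE]: appends
# "<timestamp>: <LEVEL> incident - <MESSAGE>" to the incident log.
# Returns 2 and writes nothing when LEVEL is not one of P0..P3.
log_incident() {
    local level="$1" message="$2" log_file="${3:-/var/log/incidents.log}"
    case "$level" in
        P0|P1|P2|P3) ;;
        *) echo "unknown severity: $level" >&2; return 2 ;;
    esac
    echo "$(date '+%Y-%m-%d %H:%M:%S'): ${level} incident - ${message}" >> "$log_file"
}
```

For example, `log_incident P1 "group broadcast queue stalled"` records a major incident with the current timestamp.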
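**Handling a P0 fault**:
```bash
# 1. Assess the impact immediately
curl -I http://localhost:3000/health
pm2 status
# 2. Restore service quickly
pm2 restart telegram-management-backend
# 3. Check key components
sudo systemctl status mysql redis nginx
# 4. If a quick recovery is not possible, activate the fallback plan
# (depending on the situation, this may mean switching to a standby server)
# 5. Record the incident
echo "$(date): P0 incident - service unavailable" >> /var/log/incidents.log
```
**Diagnosing performance problems**:
```bash
# 1. CPU profiling
top -p $(pgrep -f "telegram-management-backend")
perf top -p $(pgrep -f "telegram-management-backend")
# 2. Database performance
mysql -u root -p -e "SHOW PROCESSLIST;"
mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G"
# 3. Redis performance
redis-cli --bigkeys
# --hotkeys only works when maxmemory-policy is an LFU policy
redis-cli --hotkeys
# monitor streams every command and adds load; stop it as soon as possible
redis-cli monitor
# 4. Network performance
netstat -i
iftop
```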
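## Performance Tuning
### Application Layer
**Node.js parameter tuning**:
```bash
# PM2: raise the V8 heap limit; --optimize-for-size trades speed for a smaller memory footprint
pm2 start ecosystem.config.js --node-args="--max-old-space-size=4096 --optimize-for-size"
# Set the heap limit through the environment
# (--optimize-for-size is not on Node's NODE_OPTIONS allow-list, so pass it via --node-args)
export NODE_OPTIONS="--max-old-space-size=4096"
```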
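**Connection pool tuning**:
```javascript
// Database connection pool settings
const dbConfig = {
  pool: {
    max: 50,        // maximum number of connections
    min: 10,        // minimum number of connections
    acquire: 30000, // timeout for acquiring a connection (ms)
    idle: 10000     // how long a connection may stay idle (ms)
  }
};
// Redis connection settings
const redisConfig = {
  family: 4,
  keepAlive: true,
  lazyConnect: true,
  maxRetriesPerRequest: 3,
  retryDelayOnFailover: 100,
  enableOfflineQueue: false,
  // Note: the eviction policy is a Redis server setting (maxmemory-policy);
  // a client is unlikely to act on this field, so set it on the server as well.
  maxmemoryPolicy: 'allkeys-lru'
};
```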
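### Database Optimization
**Query optimization**:
```sql
-- Review slow queries (mysql.slow_log is populated only when log_output includes TABLE)
SELECT * FROM mysql.slow_log WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR);
-- Create composite indexes
CREATE INDEX idx_task_status_created ON group_tasks(status, createdAt);
CREATE INDEX idx_account_health_status ON tg_account_pool(healthScore, status);
-- Partition large tables (createdAt must be part of every unique key, including the primary key)
ALTER TABLE risk_logs PARTITION BY RANGE (YEAR(createdAt)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION p2025 VALUES LESS THAN (2026),
  PARTITION pmax VALUES LESS THAN MAXVALUE -- catch-all so rows from 2026 onward are still accepted
);
```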
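**Configuration tuning**:
```ini
# MySQL configuration tuning
[mysqld]
# InnoDB settings
innodb_buffer_pool_size = 8G
innodb_log_file_size = 512M
innodb_log_buffer_size = 128M
innodb_flush_log_at_trx_commit = 2
# Query cache (MySQL 5.7 and earlier only; the query cache was removed in 8.0,
# so delete these three lines on 8.0+ or the server will refuse to start)
query_cache_type = 1
query_cache_size = 512M
query_cache_limit = 32M
# Connection settings
max_connections = 1000
thread_cache_size = 100
# Temporary table settings
tmp_table_size = 256M
max_heap_table_size = 256M
```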
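### Cache Optimization
**Redis tuning**:
```bash
# Redis configuration tuning
redis-cli CONFIG SET maxmemory-policy allkeys-lru
redis-cli CONFIG SET tcp-keepalive 300
redis-cli CONFIG SET timeout 0
# Cache warm-up script: resets the TTL of existing account cache keys to one hour
# (KEYS blocks Redis while it scans; prefer SCAN on large production datasets)
redis-cli EVAL "
local keys = redis.call('KEYS', 'cache:account:*')
for i=1,#keys do
  redis.call('EXPIRE', keys[i], 3600)
end
return #keys
" 0
```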
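**Application caching strategy**:
```javascript
// Multi-level cache: an in-process Map (L1) in front of Redis (L2)
class CacheManager {
  // redisClient: a connected Redis client that provides get() and setex()
  constructor(redisClient) {
    this.l1Cache = new Map();   // in-memory cache (no TTL or size limit in this version)
    this.l2Cache = redisClient; // Redis cache
  }
  async get(key) {
    // L1 lookup
    if (this.l1Cache.has(key)) {
      return this.l1Cache.get(key);
    }
    // L2 lookup
    const raw = await this.l2Cache.get(key);
    if (raw) {
      const value = JSON.parse(raw); // parse once, reuse for L1 and the return value
      this.l1Cache.set(key, value);
      return value;
    }
    return null;
  }
  async set(key, value, ttl = 3600) {
    this.l1Cache.set(key, value);
    await this.l2Cache.setex(key, ttl, JSON.stringify(value));
  }
}
```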
## Security Management
### Access Control
**User permission management**:
```bash
# Create an operations user
sudo useradd -m -s /bin/bash telegram-ops
sudo usermod -aG sudo telegram-ops
# Set up SSH key authentication
mkdir -p /home/telegram-ops/.ssh
cat >> /home/telegram-ops/.ssh/authorized_keys << EOF
ssh-rsa YOUR_PUBLIC_KEY telegram-ops@management
EOF
chmod 700 /home/telegram-ops/.ssh
chmod 600 /home/telegram-ops/.ssh/authorized_keys
chown -R telegram-ops:telegram-ops /home/telegram-ops/.ssh
```
**Database security**:
```sql
-- Create a read-only user (for monitoring)
CREATE USER 'monitor'@'localhost' IDENTIFIED BY 'monitor_password';
GRANT SELECT ON telegram_management.* TO 'monitor'@'localhost';
-- Create a backup user (mysqldump also needs SHOW VIEW for views and TRIGGER for --triggers)
CREATE USER 'backup'@'localhost' IDENTIFIED BY 'backup_password';
GRANT SELECT, SHOW VIEW, TRIGGER, LOCK TABLES ON telegram_management.* TO 'backup'@'localhost';
-- Rotate passwords regularly
ALTER USER 'tg_user'@'localhost' IDENTIFIED BY 'new_secure_password';
FLUSH PRIVILEGES;
```
### Security Auditing
**Log audit script** (`security-audit.sh`):
```bash
#!/bin/bash
AUDIT_LOG="/var/log/security-audit.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$DATE] Security audit started" >> "$AUDIT_LOG"
# Count failed login attempts
FAILED_LOGINS=$(grep -c "Failed password" /var/log/auth.log)
echo "[$DATE] Failed login attempts: $FAILED_LOGINS" >> "$AUDIT_LOG"
# List world-writable files
find /var/www/telegram-management -type f -perm /o+w >> "$AUDIT_LOG"
# List unexpected root shell processes
ps aux | grep -v "telegram-management\|mysql\|redis\|nginx" | grep -E "(bash|sh).*root" >> "$AUDIT_LOG"
# Count established connections on the application port
echo "[$DATE] Established connections on port 3000: $(netstat -an | grep :3000 | grep -c ESTABLISHED)" >> "$AUDIT_LOG"
echo "[$DATE] Security audit finished" >> "$AUDIT_LOG"
```
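**Hardening checks**:
```bash
# Check for pending system updates
sudo apt list --upgradable
# Check open ports (-O needs root privileges)
sudo nmap -sT -O localhost
# Record file checksums for integrity checks
find /var/www/telegram-management -type f -name "*.js" -exec md5sum {} \; > checksums.txt
# Check the SSL certificate expiry date
openssl x509 -in /path/to/cert.pem -text -noout | grep "Not After"
```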
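A checksum file is only useful if it is later compared against the live files. The sketch below records a baseline and then lists files that changed. It swaps in `sha256sum` (GNU coreutils) for the `md5sum` used above, and the helper names `write_baseline` and `changed_files` are hypothetical. The baseline path must be absolute because both helpers change into the target directory.

```bash
# write_baseline DIR BASELINE: records a SHA-256 checksum for every .js file
# under DIR, using paths relative to DIR.
write_baseline() {
    (cd "$1" && find . -type f -name '*.js' -print0 | xargs -0 -r sha256sum) > "$2"
}

# changed_files DIR BASELINE: prints the files whose content no longer matches
# the baseline or that are missing; prints nothing when everything matches.
changed_files() {
    (cd "$1" && sha256sum --check --quiet "$2" 2>/dev/null) | sed 's/: [A-Z].*$//'
}
```

A typical flow would be to run `write_baseline /var/www/telegram-management /root/js-baseline.sha256` right after a deployment and `changed_files` with the same arguments during each audit. Files added after the baseline was taken are not reported by this sketch.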
## Backup and Recovery
### Automated Backups
**Full backup script** (`full-backup.sh`):
```bash
#!/bin/bash
BACKUP_BASE="/backup"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# BACKUP_PASS must be provided by the environment
# Create backup directories
mkdir -p $BACKUP_BASE/{mysql,redis,files,logs}/$DATE
# Database backup
mysqldump -u backup -p"$BACKUP_PASS" --single-transaction --routines --triggers telegram_management > $BACKUP_BASE/mysql/$DATE/full_backup.sql
gzip $BACKUP_BASE/mysql/$DATE/full_backup.sql
# Redis backup
redis-cli --rdb $BACKUP_BASE/redis/$DATE/dump.rdb
# File backup
tar -czf $BACKUP_BASE/files/$DATE/application.tar.gz /var/www/telegram-management
tar -czf $BACKUP_BASE/files/$DATE/sessions.tar.gz /var/www/telegram-management/backend/sessions
# Log backup
tar -czf $BACKUP_BASE/logs/$DATE/logs.tar.gz /var/www/telegram-management/backend/logs
# Write the backup manifest
cat > $BACKUP_BASE/manifest_$DATE.txt << EOF
Backup time: $(date)
Database size: $(du -h $BACKUP_BASE/mysql/$DATE/full_backup.sql.gz | cut -f1)
Redis size: $(du -h $BACKUP_BASE/redis/$DATE/dump.rdb | cut -f1)
Application files size: $(du -h $BACKUP_BASE/files/$DATE/application.tar.gz | cut -f1)
Session files size: $(du -h $BACKUP_BASE/files/$DATE/sessions.tar.gz | cut -f1)
Log files size: $(du -h $BACKUP_BASE/logs/$DATE/logs.tar.gz | cut -f1)
EOF
# Remove expired backups
find $BACKUP_BASE -type f -mtime +$RETENTION_DAYS -delete
find $BACKUP_BASE -type d -empty -delete
echo "Backup finished: $DATE"
```
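The backup script reports sizes but never confirms that the archives can be read back. As a sketch, the helper below tests every compressed file in a backup directory. `verify_backup` is a hypothetical name. It checks that each archive is readable, not that the data inside is complete.

```bash
# verify_backup DIR: tests every .gz and .tar.gz file under DIR.
# Prints "CORRUPT: <path>" for each unreadable archive and returns 1 if any
# was found; prints nothing and returns 0 when all archives are readable.
verify_backup() {
    local dir="$1" bad
    bad=$(find "$dir" -type f -name '*.gz' | while IFS= read -r file; do
        case "$file" in
            (*.tar.gz) tar -tzf "$file" > /dev/null 2>&1 || echo "CORRUPT: $file" ;;
            (*)        gzip -t "$file" 2> /dev/null || echo "CORRUPT: $file" ;;
        esac
    done)
    if [ -z "$bad" ]; then
        return 0
    fi
    echo "$bad"
    return 1
}
```

Called at the end of `full-backup.sh`, for example as `verify_backup "$BACKUP_BASE" || echo "backup verification failed" | logger -t telegram-management`, it would flag a truncated archive on the day it is written instead of on the day it is needed.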
### Restore Procedures
**Database restore**:
```bash
# Full restore (decompress the dump first if it was gzipped by the backup script)
mysql -u root -p telegram_management < backup_file.sql
# Restore a single table
mysql -u root -p telegram_management -e "DROP TABLE IF EXISTS group_tasks;"
mysqldump -u backup -p backup_telegram_management group_tasks | mysql -u root -p telegram_management
# Verify the restore
mysql -u root -p -e "SELECT COUNT(*) FROM telegram_management.group_tasks;"
```
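**Application restore**:
```bash
# Stop the service
pm2 stop telegram-management-backend
# Restore application files
# (tar strips the leading "/" when archiving, so the archives hold paths
# relative to / and must be extracted there)
sudo rm -rf /var/www/telegram-management
sudo tar -xzf /backup/files/20240101_020000/application.tar.gz -C /
# Restore session files
sudo tar -xzf /backup/files/20240101_020000/sessions.tar.gz -C /
# Restore ownership and permissions
sudo chown -R telegram-ops:telegram-ops /var/www/telegram-management
sudo chmod +x /var/www/telegram-management/backend/src/app.js
# Restart the service (assumes ecosystem.config.js lives in backend/, as in the update script)
cd /var/www/telegram-management/backend
pm2 start ecosystem.config.js --env production
```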
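### Disaster Recovery
**Failover steps**:
```bash
# 1. Assess the impact
curl -I http://primary-server:3000/health
ping primary-server
# 2. Point DNS at the standby server
# (the procedure depends on the DNS provider)
# 3. Restore the latest backup on the standby server
./restore-from-backup.sh latest
# 4. Verify the service
curl -I http://backup-server:3000/health
./health-check.sh
# 5. Notify the people involved
echo "Failover complete, now running on the standby server" | mail -s "Failover notice" team@company.com
```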
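The manual calls `restore-from-backup.sh latest` but does not show how `latest` is resolved. One plausible approach, sketched below, relies on the `YYYYMMDD_HHMMSS` directory names written by `full-backup.sh`: they sort chronologically, so the lexically largest name is the newest backup. `latest_backup` is a hypothetical helper and may differ from what the real script does.

```bash
# latest_backup BASE: prints the newest YYYYMMDD_HHMMSS directory name under
# BASE/mysql, or returns 1 when no backup directory exists.
latest_backup() {
    local base="$1" newest
    newest=$(find "$base/mysql" -mindepth 1 -maxdepth 1 -type d \
                 -name '[0-9]*_[0-9]*' 2>/dev/null | sed 's|.*/||' | sort | tail -n 1)
    if [ -z "$newest" ]; then
        return 1
    fi
    echo "$newest"
}
```

A restore script could start with `STAMP=$(latest_backup /backup) || { echo "no backup found"; exit 1; }` and then read the archives from `/backup/*/$STAMP/`.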
## Version Updates
### Rolling Update Process
**Update script** (`rolling-update.sh`):
```bash
#!/bin/bash
NEW_VERSION=$1
BACKUP_DIR="/backup/pre-update-$(date +%Y%m%d)"
if [ -z "$NEW_VERSION" ]; then
    echo "Usage: $0 <version>"
    exit 1
fi
echo "Updating to version: $NEW_VERSION"
# 1. Back up the current deployment
echo "Creating pre-update backup..."
mkdir -p "$BACKUP_DIR"
cp -r /var/www/telegram-management "$BACKUP_DIR/"
# 2. Download the new version
echo "Downloading new version..."
cd /tmp
rm -rf telegram-management-system  # remove leftovers from an earlier failed run
git clone -b "$NEW_VERSION" https://github.com/your-org/telegram-management-system.git || exit 1
cd telegram-management-system
# 3. Check for dependency changes
echo "Checking dependency changes..."
diff backend/package.json /var/www/telegram-management/backend/package.json
# 4. Check database migrations (if any)
echo "Checking database migrations..."
cd backend
npm run migrate:check
# 5. Build the new version (abort before any downtime if the build fails)
echo "Building frontend..."
cd ../frontend
npm install && npm run build || { echo "Build failed, aborting update"; exit 1; }
# 6. Stop the service
echo "Stopping service..."
pm2 stop telegram-management-backend
# 7. Deploy the new version
echo "Deploying new version..."
cp -r /tmp/telegram-management-system/backend/* /var/www/telegram-management/backend/
cp -r /tmp/telegram-management-system/frontend/dist/* /var/www/telegram-management/frontend/dist/
# 8. Install new dependencies
cd /var/www/telegram-management/backend
npm install --production
# 9. Run database migrations
npm run migrate
# 10. Start the service
echo "Starting service..."
pm2 start ecosystem.config.js --env production
# 11. Health check
sleep 10
if curl -f http://localhost:3000/health; then
    echo "Update succeeded!"
    # Clean up temporary files
    rm -rf /tmp/telegram-management-system
else
    echo "Update failed, rolling back..."
    pm2 stop telegram-management-backend
    cp -r "$BACKUP_DIR"/telegram-management/* /var/www/telegram-management/
    # Note: this restores files only; migrations applied in step 9 are not reverted
    pm2 start ecosystem.config.js --env production
    exit 1
fi
```
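### Rollback
**Quick rollback script** (`rollback.sh`):
```bash
#!/bin/bash
BACKUP_DIR=$1
if [ -z "$BACKUP_DIR" ]; then
    echo "Usage: $0 <backup-directory>"
    exit 1
fi
echo "Rolling back to: $BACKUP_DIR"
# Stop the running service
pm2 stop telegram-management-backend
# Restore the backup
cp -r "$BACKUP_DIR"/telegram-management/* /var/www/telegram-management/
# Restore the database (if a dump is present)
if [ -f "$BACKUP_DIR/database.sql" ]; then
    mysql -u root -p telegram_management < "$BACKUP_DIR/database.sql"
fi
# Restart the service (assumes ecosystem.config.js lives in backend/, as in the update script)
cd /var/www/telegram-management/backend
pm2 start ecosystem.config.js --env production
# Verify the rollback
sleep 10
if curl -f http://localhost:3000/health; then
    echo "Rollback succeeded!"
else
    echo "Rollback failed, check manually!"
    exit 1
fi
```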
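## Emergency Response
### Emergency Response Process
**Responding to a P0 fault**:
1. **Immediate response** (0-5 minutes)
- Confirm the fault and assess the impact
- Activate the emergency response team
- Attempt quick recovery actions
2. **Mitigation** (5-15 minutes)
- Apply a temporary workaround
- Switch to the standby system (if available)
- Notify users and stakeholders
3. **Root cause analysis** (15 minutes to 1 hour)
- Collect information related to the fault
- Analyze the root cause
- Draw up a repair plan
4. **Permanent fix** (1-4 hours)
- Apply the permanent fix
- Verify that the fix works
- Update monitoring and alerting
5. **Post-incident review** (within 24 hours)
- Write the incident report
- Summarize lessons learned
- Improve preventive measures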
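
During an incident it helps to know which phase the clock says you should be in. The sketch below maps minutes elapsed since the fault was confirmed to the phases listed above. `response_phase` is a hypothetical helper, and treating each boundary (5, 15, 60, 240 minutes) as the start of the next phase is an assumption, since the time windows in the list share their endpoints.

```bash
# response_phase MINUTES: prints the phase a P0 incident should have reached
# MINUTES after it was confirmed. A boundary value belongs to the later phase.
response_phase() {
    local minutes="$1"
    if [ "$minutes" -lt 5 ]; then
        echo "immediate response"
    elif [ "$minutes" -lt 15 ]; then
        echo "mitigation"
    elif [ "$minutes" -lt 60 ]; then
        echo "root cause analysis"
    elif [ "$minutes" -lt 240 ]; then
        echo "permanent fix"
    else
        echo "post-incident review"
    fi
}
```

For example, `response_phase 20` prints `root cause analysis`, which is a cue that mitigation should already be in place.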
### Emergency Contacts
**Contact list**:
```bash
# Emergency contacts
PRIMARY_ONCALL="Zhang San <zhangsan@company.com> +86-138-0000-0000"
SECONDARY_ONCALL="Li Si <lisi@company.com> +86-138-1111-1111"
MANAGER="Wang Wu <wangwu@company.com> +86-138-2222-2222"
# External service contacts
CLOUD_PROVIDER_SUPPORT="+86-400-xxx-xxxx"
DNS_PROVIDER_SUPPORT="support@dns-provider.com"
SSL_PROVIDER_SUPPORT="support@ssl-provider.com"
```
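### Incident Notification Template
**Incident notification email template**:
```
Subject: [P0 incident] Telegram Management System service disruption
Severity: P0 - Critical
Start time: 2024-01-01 14:30:00
Impact: all users
Symptoms: service unresponsive; all API calls fail
Current status: being handled
Estimated recovery: 15:00:00
Actions taken:
1. Restarted the application service
2. Checked database connections
3. Started the standby server
The next update will be sent within 30 minutes.
Operations Team
Telegram Management System
```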
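Filling in the template by hand under pressure invites mistakes. As a sketch, the header fields can be rendered from arguments and piped to `mail`. `render_notice` is a hypothetical helper, and it covers only the fields that vary per incident. The list of actions taken would still be appended manually.

```bash
# render_notice LEVEL START_TIME IMPACT SYMPTOMS ETA: prints the header part
# of the incident notification with the given values filled in.
render_notice() {
    cat <<EOF
Subject: [$1 incident] Telegram Management System service disruption
Severity: $1
Start time: $2
Impact: $3
Symptoms: $4
Current status: being handled
Estimated recovery: $5
EOF
}
```

For example, `render_notice P0 "2024-01-01 14:30:00" "all users" "service unresponsive" "15:00:00" | mail -s "[P0 incident] service disruption" team@company.com` would send the filled-in header to the team address used in the failover steps.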
---
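This manual covers the operation of the Telegram Management System end to end: routine tasks, monitoring, fault handling, performance tuning, security management, backup and recovery, version updates, and emergency response. The operations team should follow it closely to keep the system running reliably.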