Initial infrastructure documentation - comprehensive homelab reference
DEPLOY-NOW.md
# 🚀 DEPLOY NOW - Stop the Alert Spam!

## Quick Deploy (Copy/Paste This)

Run these commands from **your terminal** (PC, not OpenClaw):

### 1. Access the Prometheus container

```bash
# SSH to Proxmox
ssh root@10.0.10.3

# Enter the Prometheus container
pct enter 125
```

### 2. Back up existing configs

```bash
# Inside the container
mkdir -p /etc/prometheus/backups
cp /etc/prometheus/alertmanager.yml /etc/prometheus/backups/alertmanager.yml.backup
cp /etc/prometheus/rules/homelab-alerts.yml /etc/prometheus/backups/homelab-alerts.yml.backup
```

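The `cp` commands above overwrite the previous backup every time you redeploy. If you would rather keep a history, a small helper along these lines could add a timestamp to each copy. The function name `backup_file` and the naming scheme are illustrative assumptions, not part of the existing setup.

```bash
# Hypothetical helper (not part of the original setup): copy a file into a
# backup directory, appending a timestamp so earlier backups are kept.
backup_file() {
  src="$1"        # file to back up
  dest_dir="$2"   # directory that collects the backups
  stamp="$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest_dir" || return 1
  # -p preserves mode and timestamps of the original file
  cp -p "$src" "$dest_dir/$(basename "$src").$stamp.backup"
}
```

Inside the container this could be called as `backup_file /etc/prometheus/alertmanager.yml /etc/prometheus/backups`, and likewise for the rules file.
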
### 3. Download the new configs

From your **PC**, download the updated files I created:

```bash
# On your PC
scp root@10.0.10.28:/root/.openclaw/workspace/fred-infrastructure/alertmanager-config-updated.yml ~/
scp root@10.0.10.28:/root/.openclaw/workspace/fred-infrastructure/prometheus-alert-rules-updated.yml ~/
```

### 4. Upload to Prometheus container

```bash
# On your PC
scp ~/alertmanager-config-updated.yml root@10.0.10.3:/tmp/
scp ~/prometheus-alert-rules-updated.yml root@10.0.10.3:/tmp/
```

Then, back in the Proxmox SSH session (run `exit` first if you are still inside the container):

```bash
# Copy into container
pct push 125 /tmp/alertmanager-config-updated.yml /etc/prometheus/alertmanager.yml
pct push 125 /tmp/prometheus-alert-rules-updated.yml /etc/prometheus/rules/homelab-alerts.yml
```

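Pushing straight over the live files means a typo in the new YAML only shows up at reload time. One possible safeguard, sketched here as an assumption rather than part of the original procedure, is to push the new files to `/tmp` inside the container first and copy them into place only if a checker accepts them. `install_if_valid` is a hypothetical helper name.

```bash
# Hypothetical helper: install NEW over TARGET only if a checker accepts NEW.
# Usage: install_if_valid NEW TARGET CHECKER [CHECKER_ARGS...]
# The checker is invoked as: CHECKER [CHECKER_ARGS...] NEW
install_if_valid() {
  new="$1"
  target="$2"
  shift 2
  if "$@" "$new"; then
    cp "$new" "$target"
  else
    echo "validation failed, leaving $target untouched" >&2
    return 1
  fi
}
```

Assuming `amtool` and `promtool` are available inside the container, this might be used as `install_if_valid /tmp/alertmanager-config-updated.yml /etc/prometheus/alertmanager.yml amtool check-config` and `install_if_valid /tmp/prometheus-alert-rules-updated.yml /etc/prometheus/rules/homelab-alerts.yml promtool check rules`.
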
### 5. Reload Prometheus

```bash
# Inside the container (pct enter 125)
systemctl reload prometheus
systemctl reload prometheus-alertmanager

# Verify services reloaded
systemctl status prometheus
systemctl status prometheus-alertmanager
```

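`systemctl status` shows that the processes are running, but a service that rejected the new file on reload typically keeps running with the old config. Both Prometheus and Alertmanager expose a "last reload successful" gauge on their `/metrics` endpoints, so a check along these lines may be more telling. `reload_ok` is a hypothetical helper name.

```bash
# Hypothetical helper: reads Prometheus-style metrics text on stdin and
# succeeds only if the named gauge is reported with the value 1.
reload_ok() {
  metric="$1"
  grep -q "^${metric} 1\$"
}
```

For example: `curl -s http://10.0.10.25:9090/metrics | reload_ok prometheus_config_last_reload_successful` and `curl -s http://10.0.10.25:9093/metrics | reload_ok alertmanager_config_last_reload_successful`. A non-zero exit status would suggest the reload did not take effect.
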
### 6. Test it works

```bash
# Should list the newly loaded alert rules
curl http://10.0.10.25:9090/api/v1/rules

# Send test alert to Discord
# (uses the v2 API; Alertmanager removed the v1 alerts endpoint in 0.27)
curl -X POST http://10.0.10.25:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[
  {
    "labels": {
      "alertname": "TestCriticalAlert",
      "severity": "critical",
      "instance": "test:9100"
    },
    "annotations": {
      "summary": "Test alert - please ignore"
    }
  }
]'
```

You should see the test alert appear in **Discord** within 30 seconds!

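To confirm the other half of the change (WARNING alerts staying silent), it can help to send the same test alert with a different severity. A minimal sketch, assuming label values without quotes or backslashes; `make_test_alert` is a hypothetical helper name:

```bash
# Hypothetical helper: print a one-alert JSON payload for the Alertmanager API.
# Arguments are assumed to contain no double quotes or backslashes,
# because no JSON escaping is done here.
make_test_alert() {
  alertname="$1"
  severity="$2"
  instance="$3"
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"},' \
    "$alertname" "$severity" "$instance"
  printf '"annotations":{"summary":"Test alert - please ignore"}}]\n'
}
```

For example, `make_test_alert TestWarningAlert warning test:9100 | curl -X POST -H 'Content-Type: application/json' -d @- http://10.0.10.25:9093/api/v2/alerts` should produce nothing in Discord or email if the new routing behaves as described.
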
---

## ⚡ EVEN FASTER: One-Liner (Advanced)

If you want to do it all in one go:

```bash
ssh root@10.0.10.3 "pct exec 125 -- bash -c '
mkdir -p /etc/prometheus/backups && \
cp /etc/prometheus/alertmanager.yml /etc/prometheus/backups/alertmanager.yml.backup && \
cp /etc/prometheus/rules/homelab-alerts.yml /etc/prometheus/backups/homelab-alerts.yml.backup && \
curl -f -o /etc/prometheus/alertmanager.yml http://10.0.10.28:PORT/workspace/fred-infrastructure/alertmanager-config-updated.yml && \
curl -f -o /etc/prometheus/rules/homelab-alerts.yml http://10.0.10.28:PORT/workspace/fred-infrastructure/prometheus-alert-rules-updated.yml && \
systemctl reload prometheus prometheus-alertmanager
'"
```

(Replace `PORT` with the OpenClaw web port. This only works if you have file serving enabled. The `-f` flag makes `curl` stop on an HTTP error instead of saving the error page as your config.)

---

## 🎯 What This Does

**Before:**
- ⚠️ WARNING alerts every 2 minutes → Email
- "CPU changed 61.5% rapidly (5.04 → 1.94)" ← WTF is this even

**After:**
- ✅ CRITICAL alerts only → Discord
- ❌ WARNING alerts → Logged, not sent
- 📧 Email inbox → CLEAN

**Result:**
- Your inbox goes from 50+ alerts/day to ZERO
- Discord gets only real emergencies (host down, disk full, etc.)
- Warnings are still logged in the Prometheus UI if you want to check

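The updated `alertmanager.yml` itself is not reproduced in this file. As a rough sketch of how "CRITICAL to Discord, everything else dropped" is commonly expressed, the helper below writes an illustrative config to a scratch path for reading. The receiver names and webhook URL are placeholders, and the sketch assumes an Alertmanager version recent enough to support `matchers` and `discord_configs`; the deployed file may well differ.

```bash
# Writes an ILLUSTRATIVE Alertmanager routing config to the given path.
# Receiver names and the webhook URL are placeholders, not the deployed values.
write_example_routing() {
  cat > "$1" <<'EOF'
route:
  receiver: "null"            # default: WARNING and everything else is dropped
  routes:
    - matchers:
        - severity="critical"
      receiver: discord       # only CRITICAL alerts are forwarded
receivers:
  - name: "null"              # no integrations, so nothing is sent
  - name: discord
    discord_configs:
      - webhook_url: "https://discord.com/api/webhooks/REPLACE_ME"
EOF
}
```

Running `write_example_routing /tmp/example-alertmanager.yml` and then `amtool check-config /tmp/example-alertmanager.yml` (if `amtool` is installed) is one way to experiment without touching the live config.
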
---

## 🚨 Current Alert Spam Example

That alert you just got:

```
⚠️ WARNING: 10.0.10.3:9100
cpu_usage on 10.0.10.3:9100 changed 61.5% rapidly (5.04 → 1.94)
```

This is **literally complaining that CPU usage went DOWN**. Pure noise.

After deployment: **This will NEVER notify you again.** It'll be logged if you want to see it, but won't spam your inbox.

---

## Questions?

- **"I can't SSH to Proxmox"** → Let me know, I'll help
- **"It broke something"** → Restore the backups inside the container: `cp /etc/prometheus/backups/alertmanager.yml.backup /etc/prometheus/alertmanager.yml` and `cp /etc/prometheus/backups/homelab-alerts.yml.backup /etc/prometheus/rules/homelab-alerts.yml`, then reload both services as in step 5
- **"I want it even quieter"** → We can tune thresholds further
- **"I want email back for criticals"** → Easy to add

**Just deploy this and watch your inbox become peaceful again.** 🧘♂️✨