# 📊 MONITORING SETUP GUIDE
**PT. Sarana Gemilang Finance System**

The system already exposes two health check endpoints that are ready for monitoring:
- `GET /health`: basic check with a fast response
- `GET /health/detailed`: comprehensive check (DB latency, memory, counters)

Pick one of the services below. **UptimeRobot is recommended** because its free tier is sufficient for this system's needs.

---

## OPTION 1: UPTIMEROBOT (Free, Recommended) ⭐

UptimeRobot checks from multiple global locations every 5 minutes. The free tier includes 50 monitors.

### Setup (10 minutes)

1. **Create an account**: https://uptimerobot.com/signUp
2. **Verify your email**, then log in.
3. **Add New Monitor**:
   - Monitor Type: **HTTP(s)**
   - Friendly Name: `Sarana Gemilang API - Basic`
   - URL: `https://YOUR-DOMAIN.com/health`
   - Monitoring Interval: **5 minutes**
   - Monitor Timeout: **30 seconds**
   - HTTP Method: GET
   - Expected Status Codes: `200`
4. **Save changes**.
5. **Add a second monitor** for the detailed endpoint:
   - Friendly Name: `Sarana Gemilang API - Detailed`
   - URL: `https://YOUR-DOMAIN.com/health/detailed`
   - Monitoring Interval: **5 minutes**
   - HTTP Method: GET
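   - Expected Status Codes: `200` (the endpoint returns 503 when down)
   - **Keyword Monitoring**: enable it and set the keyword to `"status":"ok"` (in UptimeRobot this may require choosing the **Keyword** monitor type instead of **HTTP(s)**)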
6. **Setup Alert Contacts**:
   - Settings → Alert Contacts → Add Alert Contact
   - Type: **Email** (or **Telegram** for instant notifications)
   - Email: the IT team's email address
   - For Telegram: follow the in-app guide (chat with the bot @UptimeRobotBot)
7. **Subscribe the alert contact** to both monitors you created.

### Result
- If `/health` returns a non-200 status → email alert
- If the `/health/detailed` response does not contain `"status":"ok"` (meaning degraded or down) → email alert
- A public status page is created automatically at `https://stats.uptimerobot.com/YOUR-ID` (optional; can be shared with clients)
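
A keyword check is a plain substring search on the response body, so `"status":"ok"` only matches when the server emits compact JSON with no space after the colon. The sketch below illustrates this, assuming the endpoint serializes its response with `JSON.stringify` (the guide does not say how the response is built); `keywordPresent` is a hypothetical helper, not part of the system.

```js
// Hypothetical helper: mimics the substring search a keyword monitor performs.
function keywordPresent(body, keyword) {
  return body.includes(keyword);
}

const KEYWORD = '"status":"ok"';

// JSON.stringify without an indent argument emits compact JSON, so the keyword matches.
const compact = JSON.stringify({ status: "ok", uptime: { ms: 1000 } });
// Pretty-printed JSON puts a space after each colon, so the same keyword no longer matches.
const pretty = JSON.stringify({ status: "ok", uptime: { ms: 1000 } }, null, 2);
// A degraded response never contains the keyword, which is what triggers the alert.
const degraded = JSON.stringify({ status: "degraded" });

console.log(keywordPresent(compact, KEYWORD));  // true
console.log(keywordPresent(pretty, KEYWORD));   // false
console.log(keywordPresent(degraded, KEYWORD)); // false
```

If the monitor reports "down" while the endpoint looks healthy in a browser, comparing the raw body against the keyword this way is a quick first check.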

---

## OPTION 2: BETTER STACK (Free for 10 monitors, nicer UI)

Better Stack (formerly Better Uptime) is a good fit if you want a more polished dashboard.

### Setup

1. **Sign up**: https://betterstack.com/users/sign-up
2. **Create Monitor** → HTTP/HTTPS
   - URL: `https://YOUR-DOMAIN.com/health/detailed`
   - Check frequency: **3 minutes**
   - Request timeout: **30s**
   - Verify SSL: Yes
   - Expected status: **200-299**
   - Required keyword: `"status":"ok"`
3. **Add an escalation policy**: email the IT team, with optional SMS for high-severity incidents.
4. **Heartbeat monitor** (bonus): can also monitor cron jobs.
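
A heartbeat monitor works in the opposite direction from an HTTP monitor: the job calls a URL generated by the service, and an alert fires when the call stops arriving. A minimal sketch of a cron job reporting in this way follows. `HEARTBEAT_URL` is a placeholder and `runWithHeartbeat` is a hypothetical helper; the code assumes Node 18 or later for the global `fetch`.

```js
// Placeholder: replace with the heartbeat URL generated by the monitoring service.
const HEARTBEAT_URL = "https://example.com/heartbeat/YOUR-HEARTBEAT-ID";

// Runs the job, then pings the heartbeat URL only if the job succeeded.
// fetchImpl is injectable so the function can be exercised without network access.
async function runWithHeartbeat(job, url = HEARTBEAT_URL, fetchImpl = fetch) {
  // If the job throws, the ping is skipped and the monitor reports a missed heartbeat.
  await job();
  const res = await fetchImpl(url, { signal: AbortSignal.timeout(10_000) });
  return res.ok;
}
```

A nightly task could then be wrapped as `runWithHeartbeat(async () => { /* backup logic */ })` and scheduled with cron as usual.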

---

## OPTION 3: PINGDOM (Paid but enterprise-grade)

For enterprise scale. Starts at $10/month. Skip it if your team is small.

---

## OPTION 4: SELF-HOSTED (Free, for teams that prefer in-house)

### A. Uptime Kuma (open-source, self-hosted alternative to UptimeRobot)

```bash
# Run on a server separate from the app server (important: a monitor on the same host goes down with the app)
docker run -d --restart=always -p 3001:3001 \
  -v uptime-kuma:/app/data \
  --name uptime-kuma \
  louislam/uptime-kuma:1
```

Open `http://server-ip:3001`, create the admin account, then add a monitor with the health check URL.

### B. Healthchecks.io self-hosted

https://github.com/healthchecks/healthchecks (also a good fit for monitoring cron jobs)

---

## HOW TO INTERPRET THE /health/detailed RESPONSE

```json
{
  "status": "ok",                    // "ok" | "degraded" | "down"
  "timestamp": "2026-05-18T...",
  "uptime": { "ms": 37800000, "human": "10h 30m" },
  "database": {
    "connected": true,
    "latencyMs": 5                   // watch if consistently > 100 ms (alert rule: > 500 ms)
  },
  "memory": {
    "heapUsedMB": 120,               // alert if > 400 (warn) or > 700 (critical)
    "rssMB": 240
  },
  "counters": {
    "activeRefreshTokens": 15,       // tracks active users
    "recentAuditLogs": 234           // activity in the last 24 hours
  },
  "thresholds": {
    "memWarnMB": 400,
    "memDownMB": 700
  }
}
```

### Status Codes
- `200 OK` + `"status":"ok"` → everything is healthy
- `200 OK` + `"status":"degraded"` → still running but with a warning (high memory)
- `503 Service Unavailable` + `"status":"down"` → DB not connected or memory critical
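
The mapping above can be sketched as a small function. This is an illustration only, not the system's actual implementation: `deriveHealth` is a hypothetical helper, and the strict `>` comparisons are an assumption based on the threshold comments in the sample response.

```js
// Thresholds mirror the "thresholds" object in the sample response.
const THRESHOLDS = { memWarnMB: 400, memDownMB: 700 };

// Hypothetical helper: derives the status string and HTTP code from raw readings.
function deriveHealth({ dbConnected, heapUsedMB }, t = THRESHOLDS) {
  // No DB connection or critical memory use means the service is reported as down.
  if (!dbConnected || heapUsedMB > t.memDownMB) {
    return { status: "down", httpCode: 503 };
  }
  // High memory alone is a warning: still HTTP 200, but status is "degraded".
  if (heapUsedMB > t.memWarnMB) {
    return { status: "degraded", httpCode: 200 };
  }
  return { status: "ok", httpCode: 200 };
}

console.log(deriveHealth({ dbConnected: true, heapUsedMB: 120 }).status);  // ok
console.log(deriveHealth({ dbConnected: true, heapUsedMB: 450 }).status);  // degraded
console.log(deriveHealth({ dbConnected: false, heapUsedMB: 120 }).status); // down
```

Because "degraded" still returns 200, a monitor that checks only the status code will not notice it; the keyword check on `"status":"ok"` is what catches that state.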

### Recommended Alert Rules

| Condition | Severity | Action |
|-----------|----------|--------|
| `/health/detailed` returns 503 | 🔴 CRITICAL | SMS + email to IT lead |
| `/health` unresponsive for 2 minutes (needs a check interval of 1 minute or less) | 🔴 CRITICAL | SMS + email |
| `status: degraded` for > 15 minutes | 🟡 WARNING | Email |
| DB `latencyMs > 500` consistently | 🟡 WARNING | Email |
| `memory.heapUsedMB > 600` | 🟡 WARNING | Email (early warning before down) |
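
For a custom alerting script, the rules in the table could be evaluated roughly as follows. This is a sketch under stated assumptions: the sample shape and its field names are hypothetical, and "consistently" is read as "every recent latency sample exceeds 500 ms".

```js
// Hypothetical sample shape; field names are illustrative, not part of the system.
function evaluateAlerts(s) {
  const alerts = [];
  if (s.detailedHttpCode === 503) {
    alerts.push({ severity: "CRITICAL", rule: "detailed-503" });
  }
  if (s.healthUnresponsiveSec >= 120) {
    alerts.push({ severity: "CRITICAL", rule: "health-unresponsive-2m" });
  }
  if (s.degradedForMin > 15) {
    alerts.push({ severity: "WARNING", rule: "degraded-over-15m" });
  }
  // Assumption: "consistently" means all recent samples are above 500 ms.
  if (s.recentLatenciesMs.length > 0 && s.recentLatenciesMs.every((ms) => ms > 500)) {
    alerts.push({ severity: "WARNING", rule: "db-latency-high" });
  }
  if (s.heapUsedMB > 600) {
    alerts.push({ severity: "WARNING", rule: "heap-above-600" });
  }
  return alerts;
}

const alerts = evaluateAlerts({
  detailedHttpCode: 200,
  healthUnresponsiveSec: 0,
  degradedForMin: 20,
  recentLatenciesMs: [620, 540, 710],
  heapUsedMB: 650,
});
console.log(alerts.map((a) => a.rule).join(", "));
// degraded-over-15m, db-latency-high, heap-above-600
```

Hosted services such as UptimeRobot cover the status-code and keyword rules on their own; a script like this is only relevant for the latency and memory rules, which require reading the JSON body.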

---

## EXTRA: LOG AGGREGATION (OPTIONAL)

For serious production use, also consider:

### Logtail / Better Stack Logs (free 1 GB/month)
- Pino logs (already used by the system) can be streamed directly to Logtail
- Install: `npm install @logtail/pino`
- Setup in `server.js`:
  ```js
  import pino from "pino"
  // Assumes the source token is provided via a LOGTAIL_SOURCE_TOKEN environment variable
  const transport = pino.transport({ target: "@logtail/pino", options: { sourceToken: process.env.LOGTAIL_SOURCE_TOKEN } })
  const logger = pino(transport)
  ```

### Grafana Loki (self-hosted)
- A fit if you already run your own monitoring stack
- Free and scalable, with a query language (LogQL) modeled on Prometheus's PromQL

---

## CHECKLIST: MONITORING IS ACTIVE

- [ ] UptimeRobot/Better Stack/Uptime Kuma monitor for `/health` is active
- [ ] Monitor for `/health/detailed` with keyword `"status":"ok"`
- [ ] IT team email alert contact is set up
- [ ] Test alert: stop the server briefly and confirm the email arrives
- [ ] Public status page (optional) shared with management
- [ ] Slack/Telegram integration (optional)

---

## TROUBLESHOOTING

**Q: Monitor always shows "down" even though the app is running**
- Check the firewall/security group: the application port (default 3001) must be reachable from the internet
- Check nginx/the reverse proxy: make sure `/health` is forwarded, not cached
- Check SSL: if you use HTTPS, make sure the certificate is valid

**Q: `/health/detailed` is slow or times out**
- The DB query is likely slow. Check `database.latencyMs` in the response.
- If it is consistently above 1 second, check indexes and the connection pool.

**Q: Frequent false "degraded" alarms**
- Raise the threshold in .env: `HEALTH_MEM_WARN_MB=500`
- Restart the application after changing the env
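
The restart is needed because environment variables are typically read once at startup. A minimal sketch of how such thresholds might be loaded is shown below. `HEALTH_MEM_WARN_MB` comes from this guide; `HEALTH_MEM_DOWN_MB`, the helper name `readThresholds`, and the 400/700 defaults (taken from the sample response) are assumptions about the implementation.

```js
// HEALTH_MEM_WARN_MB is named in this guide; HEALTH_MEM_DOWN_MB is an assumed name.
function readThresholds(env = process.env) {
  // Parses a positive integer, falling back to the default on missing or invalid input.
  const toMB = (raw, fallback) => {
    const n = Number.parseInt(raw ?? "", 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  const memWarnMB = toMB(env.HEALTH_MEM_WARN_MB, 400);
  const memDownMB = toMB(env.HEALTH_MEM_DOWN_MB, 700);
  // A warn threshold at or above the down threshold would make "degraded" unreachable.
  if (memWarnMB >= memDownMB) {
    throw new Error(`memWarnMB (${memWarnMB}) must be below memDownMB (${memDownMB})`);
  }
  return { memWarnMB, memDownMB };
}

console.log(readThresholds({ HEALTH_MEM_WARN_MB: "500" }).memWarnMB); // 500
console.log(readThresholds({}).memWarnMB);                            // 400
```

Keeping the warn threshold well below the down threshold preserves the early-warning window before the endpoint starts returning 503.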

---

**Status: Health check system is ready for monitoring** ✅
