Monitoring and alerting
This page lists the minimum set of alerts and dashboards needed to run Vluna reliably.
Dashboards (recommended)
- API request rate, latency, and error rate
- idempotency replay rate and conflicts
- rate limit events
- wallet insufficient balance events
- queue backlog and processing latency (if applicable)
- Postgres: CPU, connections, replication lag, disk, slow queries
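As a rough illustration of how the API-level panels above could be derived from raw response status codes, here is a minimal sketch. The signal names and the status-to-signal mapping (409 for idempotency conflicts, 429 for rate limiting, 402 for insufficient balance) are assumptions inferred from the alert table on this page, not a documented Vluna schema.

```python
from collections import Counter

# Hypothetical mapping from HTTP status to a dashboard signal name.
STATUS_TO_SIGNAL = {
    409: "idempotency_conflict",
    429: "rate_limited",
    402: "insufficient_balance",
}


def classify(status: int) -> str:
    """Map one HTTP status code to a dashboard signal name."""
    if status in STATUS_TO_SIGNAL:
        return STATUS_TO_SIGNAL[status]
    if 500 <= status <= 599:
        return "server_error"
    return "other"


def signal_rates(statuses: list[int]) -> dict[str, float]:
    """Return the fraction of responses per signal for one time window."""
    total = len(statuses)
    if total == 0:
        return {}
    counts = Counter(classify(s) for s in statuses)
    return {name: n / total for name, n in counts.items()}
```

In practice these ratios would usually be computed by the metrics backend rather than in application code; the sketch only shows which counters each panel needs.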
Alerts (recommended)
| Alert | Why |
|---|---|
| sustained 5xx rate | indicates service instability |
| sustained 429 rate | indicates capacity issues or misconfigured limits |
| rising 409 conflicts | indicates client idempotency misuse |
| rising 402 rate | indicates funding or enforcement issues |
| webhook backlog or lag | indicates billing state may drift |
| database connection saturation | indicates pool sizing issues |
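One way to interpret "sustained" in the table above is that a rate must stay over a threshold for several consecutive samples before the alert fires, so that a single spike does not page anyone. The sketch below illustrates that idea; the class name, the threshold, and the window count are hypothetical values chosen for illustration, not recommended Vluna settings.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SustainedRateAlert:
    """Fires when a rate exceeds `threshold` for `windows` consecutive samples."""

    threshold: float  # hypothetical, e.g. 0.02 means 2% of requests
    windows: int      # hypothetical, e.g. 5 one-minute samples
    _recent: deque = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Keep only the most recent `windows` breach flags.
        self._recent = deque(maxlen=self.windows)

    def observe(self, matching: int, total: int) -> bool:
        """Record one sample and return True if the alert should fire."""
        rate = matching / total if total else 0.0
        self._recent.append(rate > self.threshold)
        return len(self._recent) == self.windows and all(self._recent)
```

The same pattern applies to the 5xx and 429 rows. The "rising" rows (409, 402) would instead compare the current rate against a baseline, which this sketch does not cover.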
Runbook expectations
Every alert should link to a runbook that includes:
- how to confirm impact
- mitigation steps
- rollback steps
- how to collect support data (`X-Request-Id`, error code, correlation IDs)
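The checklist above can be enforced mechanically, for example in CI over the alert definitions. The following is a minimal sketch assuming each alert is a dictionary with a `runbook` entry that holds a `url` and a list of `sections`. That shape and the section names are hypothetical; adapt them to however alert rules are actually stored.

```python
# Section names are hypothetical labels for the four checklist items.
REQUIRED_SECTIONS = ("confirm impact", "mitigation", "rollback", "support data")


def missing_runbook_sections(alert: dict) -> list[str]:
    """Return the runbook requirements that an alert definition does not meet."""
    runbook = alert.get("runbook") or {}
    if not runbook.get("url"):
        # No linked runbook at all: everything is missing.
        return ["url", *REQUIRED_SECTIONS]
    present = {section.lower() for section in runbook.get("sections", [])}
    return [s for s in REQUIRED_SECTIONS if s not in present]
```

An empty list means the alert links to a runbook that covers all four items.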