# Production Observability Checklist

## Sentry Setup
- Set `SENTRY_DSN` in the server runtime environment.
- Set `NEXT_PUBLIC_SENTRY_DSN` for client-side error capture.
- Set `SENTRY_ENVIRONMENT` (`production`, `staging`, etc.).
- Set `SENTRY_RELEASE` to the current commit, tag, or build ID.
- Set `SENTRY_TRACES_SAMPLE_RATE` based on traffic budget.
- Optional: set replay rates (`SENTRY_REPLAYS_SESSION_SAMPLE_RATE`, `SENTRY_REPLAYS_ON_ERROR_SAMPLE_RATE`).
- For source map uploads in CI: set `SENTRY_ORG`, `SENTRY_PROJECT`, `SENTRY_AUTH_TOKEN`.
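
As a minimal sketch of how the variables above might be turned into SDK options: the helper names `parseRate` and `readSentryOptions`, the `"development"` fallback, and the `0` default rates are assumptions, not part of this project. The returned field names match the options accepted by `Sentry.init` in the Sentry JavaScript SDK. The replay rates apply to the browser SDK only.

```typescript
type Env = Record<string, string | undefined>;

interface SentryOptions {
  dsn: string | undefined;
  environment: string;
  release: string | undefined;
  tracesSampleRate: number;
  replaysSessionSampleRate: number;
  replaysOnErrorSampleRate: number;
}

// Parse a sample rate; fall back when the value is missing, non-numeric,
// or outside the valid range [0, 1].
function parseRate(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value >= 0 && value <= 1 ? value : fallback;
}

// Hypothetical helper: collects the Sentry-related variables into one object.
function readSentryOptions(env: Env): SentryOptions {
  return {
    dsn: env.SENTRY_DSN,
    environment: env.SENTRY_ENVIRONMENT ?? "development", // assumed default
    release: env.SENTRY_RELEASE,
    tracesSampleRate: parseRate(env.SENTRY_TRACES_SAMPLE_RATE, 0),
    replaysSessionSampleRate: parseRate(env.SENTRY_REPLAYS_SESSION_SAMPLE_RATE, 0),
    replaysOnErrorSampleRate: parseRate(env.SENTRY_REPLAYS_ON_ERROR_SAMPLE_RATE, 0),
  };
}
```

On the server, the result of `readSentryOptions(process.env)` could be passed to `Sentry.init`. On the client the DSN would come from `NEXT_PUBLIC_SENTRY_DSN` instead.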

## Health/Ready Endpoints
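- `GET /api/health` should stay lightweight and respond quickly, without depending on the database or other external services.
- `GET /api/ready` should validate required environment variables and database connectivity before the instance receives traffic.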
- Configure platform probes:
  - liveness -> `/api/health`
  - readiness -> `/api/ready`
- Alert if readiness returns `503` for more than 2 consecutive checks.
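
The readiness check and the alert rule above can be sketched as follows. This is an illustration under assumptions: `checkReadiness`, `shouldAlert`, the `requiredVars` list, and the `pingDb` callback are hypothetical names, and the response body shape is not prescribed by this checklist.

```typescript
type Env = Record<string, string | undefined>;

interface ReadyResult {
  status: 200 | 503;
  body: { ready: boolean; failures: string[] };
}

// Validate env + db. `requiredVars` and `pingDb` are supplied by the caller;
// `pingDb` should reject when the database is unreachable.
async function checkReadiness(
  env: Env,
  requiredVars: string[],
  pingDb: () => Promise<void>,
): Promise<ReadyResult> {
  const failures: string[] = [];
  for (const name of requiredVars) {
    if (!env[name]) failures.push(`env:${name}`);
  }
  try {
    await pingDb();
  } catch {
    failures.push("db");
  }
  const ready = failures.length === 0;
  return { status: ready ? 200 : 503, body: { ready, failures } };
}

// True when the most recent run of 503 responses is longer than `threshold` checks.
// `statuses` is ordered oldest to newest.
function shouldAlert(statuses: number[], threshold = 2): boolean {
  let run = 0;
  for (let i = statuses.length - 1; i >= 0 && statuses[i] === 503; i--) run++;
  return run > threshold;
}
```

Keeping the check a pure function of its inputs makes it straightforward to call from the `/api/ready` route handler and to test without a live database.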

## API Error Contract
- Every API error returns `{ error, code, requestId }`.
- Preserve the `error` message field for backward compatibility with the UI.
- Log unexpected `5xx` errors and send only those to Sentry.
- Do not send expected business `4xx` errors to Sentry as exceptions.
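
A minimal sketch of the contract and the reporting rule above. The function names `toErrorBody` and `shouldReportToSentry` and the `expected` flag are assumptions made for illustration. Only the `{ error, code, requestId }` shape comes from this checklist.

```typescript
interface ApiErrorBody {
  error: string;     // human-readable message, kept for the existing UI
  code: string;      // stable identifier for aggregation, e.g. LINK_NOT_FOUND
  requestId: string; // correlation id shared with server logs and Sentry
}

function toErrorBody(message: string, code: string, requestId: string): ApiErrorBody {
  return { error: message, code, requestId };
}

// Report only unexpected server-side failures. `expected` is a hypothetical flag
// the caller sets for errors it raised deliberately.
function shouldReportToSentry(status: number, expected = false): boolean {
  return !expected && status >= 500 && status <= 599;
}
```

For example, `shouldReportToSentry(404)` is `false` and `shouldReportToSentry(500)` is `true`, so a missing link never reaches Sentry as an exception while an unhandled failure does.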

## Anti-abuse and Clean Analytics
- Set `RATE_LIMIT_BACKEND=redis` in production.
- Set `RATE_LIMIT_REDIS_URL` to your Redis endpoint.
- Use `RATE_LIMIT_BACKEND=memory` only as a local/dev fallback.
- Validate rate limits on critical routes:
  - `/api/click`
  - `/api/auth/*`
  - `/api/telegram/webhook`
  - `/api/links/unlock/*`
- Track bot/filtered clicks with flags in click events (`isBot`, `isFiltered`, `filterReason`).
- Ensure analytics endpoints count only non-filtered clicks.
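
The last two items can be illustrated with a small aggregation sketch. The `ClickEvent` type, its `linkId` field, and the `countCleanClicks` name are assumptions. Only `isBot`, `isFiltered`, and `filterReason` come from this checklist. The sketch excludes events by `isFiltered` alone, on the assumption that the ingestion step already sets `isFiltered` for any bot click that should not be counted.

```typescript
interface ClickEvent {
  linkId: string;        // hypothetical identifier of the clicked link
  isBot: boolean;
  isFiltered: boolean;
  filterReason?: string; // set when isFiltered is true
}

// Count clicks per link, skipping every event flagged as filtered.
function countCleanClicks(events: ClickEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    if (event.isFiltered) continue;
    counts.set(event.linkId, (counts.get(event.linkId) ?? 0) + 1);
  }
  return counts;
}
```

Storing filtered events with a reason, rather than dropping them, keeps the raw data available for auditing the filters while the analytics view stays clean.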

## Incident Debug Flow
- In the UI, collect the `requestId` from the error message.
- Correlate `requestId` in server logs and Sentry tags.
- Use `code` for aggregation (`LINK_NOT_FOUND`, `LINK_CREATE_FAILED`, etc.).
- Determine whether the issue is a user error (`4xx`) or a service error (`5xx`).
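
The flow above can be sketched against an in-memory list of log entries. The `ErrorLogEntry` shape and the helper names are hypothetical. A real setup would query the log store or Sentry rather than an array.

```typescript
interface ErrorLogEntry {
  requestId: string;
  code: string;
  status: number;
}

// Step 1-2: find every log entry that shares the requestId reported by the UI.
function findByRequestId(logs: ErrorLogEntry[], requestId: string): ErrorLogEntry[] {
  return logs.filter((entry) => entry.requestId === requestId);
}

// Step 3: aggregate by error code to see which failures dominate.
function countByCode(logs: ErrorLogEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const { code } of logs) counts[code] = (counts[code] ?? 0) + 1;
  return counts;
}

// Step 4: separate user errors (4xx) from service errors (5xx).
function classify(status: number): "user" | "service" | "other" {
  if (status >= 400 && status <= 499) return "user";
  if (status >= 500 && status <= 599) return "service";
  return "other";
}
```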

## Release Verification
- Run `npm run lint`.
- Run `npm run build`.
- Run tests against an available database.
- Smoke test:
  - trigger a controlled client error and confirm Sentry event;
  - trigger a controlled server `500` and confirm Sentry event;
  - validate `/api/health` => 200 and `/api/ready` => 200.
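
The endpoint part of the smoke test could be automated roughly as below. `smokeCheck` and `FetchLike` are hypothetical names. The fetch function is injected so the check can run against a fake in tests and against the global `fetch` (available in Node 18+) in a release pipeline.

```typescript
// Minimal response shape this check needs; the global `fetch` result satisfies it.
interface StatusResponse {
  status: number;
}
type FetchLike = (url: string) => Promise<StatusResponse>;

// Request both probe endpoints and return a list of failures (empty means pass).
async function smokeCheck(baseUrl: string, fetchFn: FetchLike): Promise<string[]> {
  const failures: string[] = [];
  for (const path of ["/api/health", "/api/ready"]) {
    try {
      const res = await fetchFn(`${baseUrl}${path}`);
      if (res.status !== 200) failures.push(`${path} returned ${res.status}`);
    } catch {
      failures.push(`${path} request failed`);
    }
  }
  return failures;
}
```

A release script might call `smokeCheck(baseUrl, fetch)` with the deployed base URL and fail the pipeline when the returned list is not empty.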
