Operator Alerting

Operator alerting is the customer-visible incident layer for webhook delivery, quota, DLQ, and drain health. Use it when you need durable issue state and routed notifications instead of raw event firehoses.

Alerting model

This surface is Pro+ and project-scoped.

The model has four main objects:

  • operator webhooks for webhook-backed alert delivery
  • alert channels for project-owned routing targets, including email channels
  • fixed alert rules that bind incident families to channels
  • incidents and dispatch history

Issue families include delivery failures, backlog growth, quota pressure, secret lifecycle events, DLQ accumulation, and drain degradation.

Auth model

  • Public API (admin project API key): Operator notification routes live under /v1/project/... and require the admin role, not a write key.
  • Dashboard (session auth): The dashboard exposes the same controls (channels, rules, and incidents) as a UI for project members.
  • SDK / CLI: The first-party SDK and CLI do not currently wrap operator alerting. Use raw HTTP.

Channels, rules, and incidents

Channel and rule basics:

  • webhook-backed channels are managed through the operator-webhook API
  • email channels are managed through the alert-channel API
  • alert rules are fixed, one per incident family: you patch an existing rule rather than defining one from scratch
  • enabling a rule or reactivating a channel can backfill matching open incidents
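
Because webhook-backed and email channels are managed through different APIs, a script has to choose the route family by channel kind. The sketch below shows one way to encode that choice. The function name `channel_management_path` and the kind labels `webhook` and `email` are illustrative assumptions, not part of the API.

```shell
# Hypothetical helper: map a channel kind to the route family that manages it.
# The kind labels "webhook" and "email" are assumptions made for this sketch.
channel_management_path() {
  case "$1" in
    webhook) printf '%s\n' "/v1/project/operator-webhooks" ;;
    email)   printf '%s\n' "/v1/project/alert-channels" ;;
    *)       echo "unknown channel kind: $1" >&2; return 1 ;;
  esac
}
```

For example, `channel_management_path webhook` prints `/v1/project/operator-webhooks`, which can be appended to https://api.hooksbase.com in a curl call that sends an admin project API key.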

Incident lifecycle highlights:

  • most issue families use open and resolved
  • secret_lifecycle is edge-triggered and uses occurred
  • incidents can be muted, unmuted, resolved, and reopened
  • manual resolve suppresses immediate reopening until recovery is observed
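
The four manual actions above map onto the incident action routes listed later on this page. A minimal shell sketch of a wrapper follows. The function names and the `HOOKSBASE_ADMIN_KEY` environment variable are assumptions, and the sketch sends no request body because this page does not document one.

```shell
# Sketch of a wrapper for the incident action routes.
# HOOKSBASE_ADMIN_KEY and both function names are assumptions, not part of the API.
HOOKSBASE_API="https://api.hooksbase.com"

# Build the URL for one incident action ($1 = incident id, $2 = action).
# Anything other than the four documented actions is rejected.
incident_action_url() {
  case "$2" in
    mute|unmute|resolve|reopen) ;;
    *) echo "unsupported incident action: $2" >&2; return 1 ;;
  esac
  printf '%s/v1/project/operator-incidents/%s/%s\n' "$HOOKSBASE_API" "$1" "$2"
}

# POST the action with an admin project API key.
# No request body is sent; check the API reference for any fields the route accepts.
incident_action() {
  url="$(incident_action_url "$1" "$2")" || return 1
  curl -fsS -X POST "$url" \
    -H "Authorization: Bearer ${HOOKSBASE_ADMIN_KEY}"
}
```

Calling `incident_action <incident-id> mute` would then issue the POST for that incident. Keeping URL construction separate from the request makes the action allow-list easy to check without touching the network.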

This complements event drains: alerting gives you durable incident state and fan-out, while drains give you the raw lifecycle stream.

API examples

Use an admin project API key for operator alerting routes. These examples use raw HTTP because the first-party SDK and CLI do not wrap this surface yet.

List alert channels

curl https://api.hooksbase.com/v1/project/alert-channels \
  -H "Authorization: Bearer swk_..."

Update one alert rule

curl https://api.hooksbase.com/v1/project/alert-rules/terminal_failure_spike \
  -X PATCH \
  -H "Authorization: Bearer swk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "channelIds": ["alch_123"]
  }'

List open incidents

curl -G https://api.hooksbase.com/v1/project/operator-incidents \
  -H "Authorization: Bearer swk_..." \
  -d status=open \
  -d limit=20

Dispatches, clusters, and audit adjacency

Dispatch phases include:

  • opened
  • reminder
  • resolved
  • occurred

Other operator views:

  • per-channel dispatch history
  • read-time failure clusters for recent failed deliveries and DLQ activity
  • project audit logs on Business+ plans for adjacent control-plane investigation

The legacy operator-webhook API still matters because webhook-backed channels use its lifecycle, signing-secret rotation, and secret-version ledger.

Operator alerting routes, including the operator-webhook API:

  • GET/POST /v1/project/operator-webhooks
  • GET/PATCH /v1/project/operator-webhooks/{id}
  • POST /v1/project/operator-webhooks/{id}/pause
  • POST /v1/project/operator-webhooks/{id}/resume
  • POST /v1/project/operator-webhooks/{id}/archive
  • POST /v1/project/operator-webhooks/{id}/restore
  • POST /v1/project/operator-webhooks/{id}/rotate-signing-secret
  • GET /v1/project/operator-webhooks/{id}/secret-versions
  • GET /v1/project/operator-webhooks/{id}/dispatches
  • GET/POST /v1/project/alert-channels
  • GET/PATCH /v1/project/alert-channels/{id}
  • GET /v1/project/alert-channels/{id}/dispatches
  • GET /v1/project/alert-rules
  • GET /v1/project/alert-rules/{family}
  • PATCH /v1/project/alert-rules/{family}
  • GET /v1/project/operator-failure-clusters
  • GET /v1/project/operator-incidents
  • GET /v1/project/operator-incidents/{id}
  • POST /v1/project/operator-incidents/{id}/mute
  • POST /v1/project/operator-incidents/{id}/unmute
  • POST /v1/project/operator-incidents/{id}/resolve
  • POST /v1/project/operator-incidents/{id}/reopen
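
As an illustration of how the operator-webhook routes combine, the sketch below rotates a signing secret and then reads the secret-version ledger for the same webhook. It is a sketch under assumptions: the function names and the `HOOKSBASE_ADMIN_KEY` environment variable are hypothetical, and the rotate call is sent without a request body because this page does not document one.

```shell
# Sketch only: function names and HOOKSBASE_ADMIN_KEY are assumptions.
HOOKSBASE_API="https://api.hooksbase.com"

# Build an operator-webhook URL ($1 = operator webhook id, $2 = optional sub-route).
operator_webhook_url() {
  printf '%s/v1/project/operator-webhooks/%s%s\n' "$HOOKSBASE_API" "$1" "${2:+/$2}"
}

# Rotate the signing secret, then list secret versions for the same webhook.
# Stops early if the rotation request fails (curl -f returns non-zero on HTTP errors).
rotate_and_list_versions() {
  curl -fsS -X POST "$(operator_webhook_url "$1" rotate-signing-secret)" \
    -H "Authorization: Bearer ${HOOKSBASE_ADMIN_KEY}" || return 1
  curl -fsS "$(operator_webhook_url "$1" secret-versions)" \
    -H "Authorization: Bearer ${HOOKSBASE_ADMIN_KEY}"
}
```

Usage would be `rotate_and_list_versions <operator-webhook-id>` with an admin project API key exported in the environment.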

Common mistakes

  • Using a write key and expecting access to /v1/project/operator-... routes.
  • Treating alert rules as arbitrary custom rule builders instead of fixed incident families.
  • Forgetting that webhook-backed channels are still managed through operator-webhook routes.
  • Expecting raw event streaming from alerting instead of event drains.
