Incident Management

Communicate with your users through a structured incident lifecycle — from investigation to resolution.

Creating incidents

Incidents can be created in two ways:

Automatic

When the alert evaluator detects a confirmed outage (2 of 3 regions down), an incident is automatically created and linked to the affected monitor's status page components.
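
The confirmation rule can be pictured as a quorum over per-region check results. The sketch below is illustrative only: the function name `is_confirmed_outage`, the region names, and the reading of "2 of 3" as "at least 2" are assumptions, not the evaluator's actual implementation.

```python
from typing import Mapping


def is_confirmed_outage(region_up: Mapping[str, bool], min_regions_down: int = 2) -> bool:
    """Return True when at least `min_regions_down` regions report a failed check.

    `region_up` maps a hypothetical region name to whether its latest check succeeded.
    """
    regions_down = sum(1 for is_up in region_up.values() if not is_up)
    return regions_down >= min_regions_down


# A single failing region is treated as unconfirmed.
single_failure = is_confirmed_outage({"us-east": True, "eu-west": False, "ap-south": True})

# Two of three failing regions counts as a confirmed outage.
confirmed = is_confirmed_outage({"us-east": False, "eu-west": False, "ap-south": True})
```

Requiring agreement between regions is a common way to avoid opening incidents for a network problem that affects only one probe location.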

Manual

Create an incident from the Incidents page in your dashboard. This is useful for planned announcements, known issues that monitors haven't caught, or pre-emptive communication about degraded services.

When creating a manual incident, you'll provide:

  • Title — A clear, concise description of the issue
  • Affected components — Which services are impacted
  • Impact level — None, Minor, Major, or Critical
  • Initial status — Usually "Investigating"
  • Initial message — What you know so far
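
As a rough mental model, the fields above map onto a small record. The class and field names in this sketch (`ManualIncident`, `Impact`, `affected_components`, and so on) are hypothetical and chosen for illustration. The non-empty checks are also an assumption about what a sensible form would enforce.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    NONE = "none"
    MINOR = "minor"
    MAJOR = "major"
    CRITICAL = "critical"


@dataclass
class ManualIncident:
    title: str
    affected_components: list[str]
    impact: Impact
    initial_message: str
    initial_status: str = "investigating"  # the usual starting status

    def __post_init__(self) -> None:
        # Assumed validation: a title and an initial message are both required.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if not self.initial_message.strip():
            raise ValueError("initial message must not be empty")


incident = ManualIncident(
    title="Elevated API latency in the EU region",
    affected_components=["API", "Dashboard"],
    impact=Impact.MINOR,
    initial_message="We are investigating reports of slow dashboard loads.",
)
```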

Status lifecycle

Every incident moves through a defined lifecycle. Each status transition is recorded as an update visible on your public status page.

1. Investigating

You're aware of the issue and are looking into it. Affected components are updated to reflect the incident impact.

2. Identified

The root cause has been found. Communicate what went wrong and what you're doing to fix it.

3. Monitoring

A fix has been deployed and you're watching for recurrence. Services should be recovering.

4. Resolved

The incident is over. Affected components are reset to 'Operational'. A resolved_at timestamp is recorded.
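
The lifecycle can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the `Incident` class and its `transition` method are hypothetical, and the rule that statuses only move forward (with skipping allowed) is a guess rather than documented behavior.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

STATUS_ORDER = ["investigating", "identified", "monitoring", "resolved"]


@dataclass
class Incident:
    status: str = "investigating"
    resolved_at: Optional[datetime] = None
    # Each recorded update is a (status, message, timestamp) tuple.
    updates: list = field(default_factory=list)

    def transition(self, new_status: str, message: str, now: Optional[datetime] = None) -> None:
        if new_status not in STATUS_ORDER:
            raise ValueError(f"unknown status: {new_status}")
        # Assumption: statuses only move forward; skipping a step is allowed here.
        if STATUS_ORDER.index(new_status) <= STATUS_ORDER.index(self.status):
            raise ValueError(f"cannot move from {self.status} to {new_status}")
        now = now or datetime.now(timezone.utc)
        self.status = new_status
        self.updates.append((new_status, message, now))
        if new_status == "resolved":
            # Resolution records the resolved_at timestamp.
            self.resolved_at = now


incident = Incident()
incident.transition("identified", "Root cause is a database connection pool issue.")
incident.transition("monitoring", "A fix has been deployed.")
incident.transition("resolved", "Services have recovered.")
```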

Writing effective incident updates

Good incident communication builds trust. Follow these guidelines:

Be specific about impact

'Users in the EU region may experience 5-10 second delays when loading dashboards' is better than 'Some users may be affected'.

State what you know and don't know

'We've identified a database connection pool issue. We're still determining the root cause of the pool exhaustion.'

Provide an ETA when possible

'We expect to deploy a fix within the next 30 minutes' — even a rough estimate is better than silence.

Update regularly

Post an update at least every 30 minutes during active incidents, even if it's just 'Still investigating, no new information.'
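
If your team wants a reminder for the 30-minute cadence, a check like the one below could run inside your own tooling. It is a sketch only: `is_update_overdue` is a hypothetical helper, not a feature of the product.

```python
from datetime import datetime, timedelta, timezone

UPDATE_INTERVAL = timedelta(minutes=30)


def is_update_overdue(
    last_update_at: datetime,
    now: datetime,
    interval: timedelta = UPDATE_INTERVAL,
) -> bool:
    """Return True when at least `interval` has passed since the last update."""
    return now - last_update_at >= interval


last_update = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# 20 minutes after the last update: still within the cadence.
overdue_after_20_min = is_update_overdue(
    last_update, datetime(2024, 1, 1, 12, 20, tzinfo=timezone.utc)
)

# 45 minutes after the last update: time to post again.
overdue_after_45_min = is_update_overdue(
    last_update, datetime(2024, 1, 1, 12, 45, tzinfo=timezone.utc)
)
```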

Public incident timeline

Your status page displays incidents in two sections:

  • Active incidents — Pinned at the top of the page with impact badges and the full update timeline (newest first)
  • Past incidents — The last 14 days of resolved incidents, shown in a collapsed view at the bottom of the page

Each update in the timeline shows a status badge, message text, and timestamp. This gives your users a clear narrative of what happened and how you responded.
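
The two sections can be thought of as a partition of your incident list. The sketch below assumes a hypothetical record shape (`title`, `resolved_at`) and assumes the 14-day window is measured from the resolution time. Neither detail is confirmed by the product.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, TypedDict


class IncidentRow(TypedDict):
    title: str
    resolved_at: Optional[datetime]  # None while the incident is still active


PAST_WINDOW = timedelta(days=14)


def split_for_status_page(
    incidents: list[IncidentRow], now: datetime
) -> tuple[list[IncidentRow], list[IncidentRow]]:
    """Split incidents into (active, past), dropping those resolved over 14 days ago."""
    active = [i for i in incidents if i["resolved_at"] is None]
    past = [
        i
        for i in incidents
        if i["resolved_at"] is not None and now - i["resolved_at"] <= PAST_WINDOW
    ]
    return active, past


now = datetime(2024, 3, 15, tzinfo=timezone.utc)
incidents: list[IncidentRow] = [
    {"title": "API latency", "resolved_at": None},
    {"title": "Login errors", "resolved_at": datetime(2024, 3, 10, tzinfo=timezone.utc)},
    {"title": "Old outage", "resolved_at": datetime(2024, 2, 1, tzinfo=timezone.utc)},
]
active, past = split_for_status_page(incidents, now)
```

In this example the active list holds "API latency", the past list holds "Login errors", and "Old outage" falls outside the 14-day window.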

Maintenance windows

Available on Pro & Business plans.

Schedule planned maintenance to inform your users ahead of time and prevent false alerts during the window:

  • Title and description — What maintenance is being performed
  • Affected components — Which services will be impacted
  • Start and end time — The maintenance window duration

During a maintenance window:

  • Linked monitors are automatically paused — no checks are executed and no alerts fire
  • The status page displays the maintenance notice with a distinctive blue/neutral style
  • When the window ends, monitors automatically resume

Upcoming maintenance windows are also displayed on your public status page so users can plan accordingly.
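
The pause behavior during a window amounts to a time-range check before each monitor run. This is a minimal sketch with hypothetical names (`MaintenanceWindow`, `should_run_check`, `monitor_ids`). Treating the end time as exclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable


@dataclass(frozen=True)
class MaintenanceWindow:
    title: str
    monitor_ids: frozenset
    starts_at: datetime
    ends_at: datetime

    def covers(self, monitor_id: str, at: datetime) -> bool:
        # Assumption: the window is half-open, [starts_at, ends_at).
        return monitor_id in self.monitor_ids and self.starts_at <= at < self.ends_at


def should_run_check(monitor_id: str, at: datetime, windows: Iterable[MaintenanceWindow]) -> bool:
    """A linked monitor is paused while any of its maintenance windows is in progress."""
    return not any(w.covers(monitor_id, at) for w in windows)


window = MaintenanceWindow(
    title="Database upgrade",
    monitor_ids=frozenset({"api"}),
    starts_at=datetime(2024, 5, 1, 2, 0, tzinfo=timezone.utc),
    ends_at=datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc),
)

# Inside the window the linked monitor is paused.
runs_during = should_run_check("api", datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc), [window])

# Once the window ends, checks resume.
runs_after = should_run_check("api", datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc), [window])

# Monitors not linked to the window are unaffected.
runs_unlinked = should_run_check("web", datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc), [window])
```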