How to Build a Public Status Page from Prometheus Monitoring Data

Prometheus Knows. Your Customers Don't.

You have Prometheus scraping every service. AlertManager is routing pages to your on-call engineer. Your dashboards show exactly when things broke and when they recovered. But your customers are still refreshing the app wondering if the outage is just them — because your status page either says "All Systems Operational" or hasn't been touched in six months.

The data is there. The problem is that nothing connects your internal monitoring pipeline to the thing customers actually look at.

This post walks through exactly that connection: from a Prometheus alert rule firing, through AlertManager, into Alert24, and out to a public status page with a real incident timeline.

How the Pipeline Fits Together

Before writing any config, it helps to see the data flow:

Stage                      Tool                        What happens
Metric threshold crossed   Prometheus                  Evaluates alert rule, transitions to firing
Alert routed               AlertManager                Matches receiver, sends webhook
Incident opened            Alert24                     Creates incident, links to service, notifies on-call
Status page updated        Alert24                     Service status flips to Degraded or Down
Alert resolves             Prometheus + AlertManager   Sends resolved webhook
Incident closed            Alert24                     Status page returns to Operational, timeline finalized

Your team already owns the first two stages. The work here is wiring AlertManager to Alert24 and mapping your services correctly.

Step 1: Define a Meaningful Alert Rule

A lot of Prometheus alert rules are written for engineers, not for incident correlation. If you want the alert to map cleanly to a customer-facing service, give it explicit labels that match how you think about your services.

# alerts/api.yml
groups:
  - name: api_availability
    rules:
      - alert: APIHighErrorRate
        expr: |
          sum(rate(http_requests_total{job="api", status=~"5.."}[5m]))
          /
          sum(rate(http_requests_total{job="api"}[5m])) > 0.05
        for: 2m
        labels:
          severity: critical
          service: api
          team: platform
        annotations:
          summary: "API error rate above 5% for 2 minutes"
          description: "Current error rate: {{ $value | humanizePercentage }}"

The service: api label is what you'll use downstream to route this alert to the right Alert24 service. Keep it lowercase and consistent — you'll reference it in AlertManager and again in Alert24.
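To make the rule's timing concrete, here is a minimal Python sketch of how the threshold and the for: 2m clause interact. It is an illustration, not Prometheus code: the 30-second evaluation interval and the function names are assumptions, and zero traffic is treated as a 0% error rate for simplicity.

```python
THRESHOLD = 0.05     # the "> 0.05" in the rule expression
FOR_SECONDS = 120    # the "for: 2m" clause


def error_ratio(errors_per_sec, total_per_sec):
    """5xx request rate divided by total request rate."""
    if total_per_sec == 0:
        return 0.0  # simplification: no traffic counts as no errors
    return errors_per_sec / total_per_sec


def alert_states(samples, eval_interval=30):
    """Return the alert state at each rule evaluation.

    samples holds one (errors_per_sec, total_per_sec) pair per evaluation.
    eval_interval is an assumed rule evaluation interval in seconds.
    """
    states = []
    active_since = None  # time of the first evaluation in the current breach
    for i, (errors, total) in enumerate(samples):
        now = i * eval_interval
        if error_ratio(errors, total) > THRESHOLD:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= FOR_SECONDS else "pending")
        else:
            active_since = None  # any healthy evaluation resets the timer
            states.append("inactive")
    return states


# 10% errors for five evaluations in a row: fires on the fifth (t = 120s).
print(alert_states([(10, 100)] * 5))
# A single healthy evaluation in the middle resets the pending timer.
print(alert_states([(10, 100), (10, 100), (1, 100), (10, 100), (10, 100)]))
```

The practical consequence is that a brief spike never reaches AlertManager, and a sustained one reaches it no sooner than two minutes after the first breach.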

Step 2: Configure AlertManager to Send to Alert24

AlertManager uses a webhook receiver to forward alerts to external systems. Alert24 exposes an AlertManager-compatible webhook endpoint — you get the URL from the Integrations section of your account.

# alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'alert24-default'
  routes:
    - matchers:
        - severity="critical"
      receiver: 'alert24-critical'

receivers:
  - name: 'alert24-default'
    webhook_configs:
      - url: 'https://app.alert24.io/integrations/alertmanager/YOUR_INTEGRATION_KEY'
        send_resolved: true

  - name: 'alert24-critical'
    webhook_configs:
      - url: 'https://app.alert24.io/integrations/alertmanager/YOUR_INTEGRATION_KEY'
        send_resolved: true
        http_config:
          bearer_token: 'YOUR_BEARER_TOKEN'

Two things to get right here. First, set send_resolved: true — this is what closes the incident automatically when Prometheus sees the metric recover. Without it, every incident needs to be closed manually. Second, make sure group_by includes service so that separate services don't get merged into a single noisy alert group.
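The effect of group_by is easier to see with a small Python sketch. This is not AlertManager's implementation, only an illustration of how the grouping key is formed; the sample alerts and the HighErrorRate name are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alert label sets by the values of the group_by labels."""
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical alerts: the same rule name firing for two services.
alerts = [
    {"alertname": "HighErrorRate", "service": "api", "severity": "critical"},
    {"alertname": "HighErrorRate", "service": "db", "severity": "critical"},
]

merged = group_alerts(alerts, ["alertname"])
separate = group_alerts(alerts, ["alertname", "service"])

print(len(merged))    # 1 group: both services share one notification
print(len(separate))  # 2 groups: one notification per service
```

With service in the key, each service gets its own webhook call, which is what lets the receiving side open one incident per service.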

Step 3: Map Services in Alert24

In Alert24, create a service called "API" (or whatever matches your service label). When you configure the AlertManager integration, you map the incoming label value to this service.

The mapping screen looks something like this:

Incoming label: service = "api"  →  Alert24 Service: API
Incoming label: service = "db"   →  Alert24 Service: Database
Incoming label: service = "cdn"  →  Alert24 Service: CDN

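Once this is in place, when APIHighErrorRate fires, Alert24 opens an incident against the API service and changes its status page indicator to Degraded. With the settings above, AlertManager holds a new alert group for group_wait (30 seconds) before sending the webhook, so customers see the change roughly half a minute after the alert fires. Counting the for: 2m window, that is about two and a half minutes after the error rate first crosses 5%, and usually before anyone on your team has acknowledged the page.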
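Conceptually, the mapping step takes the webhook body that AlertManager sends and turns it into per-service status changes. The sketch below shows that idea in Python. The payload fields (alerts, status, labels) follow AlertManager's webhook format, but SERVICE_MAP, status_updates, and the rule that firing means Degraded are hypothetical stand-ins; Alert24's actual logic is not shown in this post.

```python
# Hypothetical mapping from the incoming service label to a status page service.
SERVICE_MAP = {"api": "API", "db": "Database", "cdn": "CDN"}


def status_updates(payload):
    """Map an AlertManager webhook payload to {service name: status}."""
    updates = {}
    for alert in payload.get("alerts", []):
        label = alert.get("labels", {}).get("service")
        service = SERVICE_MAP.get(label)
        if service is None:
            continue  # unmapped or missing label: nothing to update
        if alert.get("status") == "firing":
            updates[service] = "Degraded"  # a firing alert always wins
        else:
            updates.setdefault(service, "Operational")
    return updates


# Trimmed example payload; real payloads carry more fields.
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "APIHighErrorRate", "service": "api"}},
        {"status": "resolved",
         "labels": {"alertname": "ExampleAlert", "service": "cdn"}},
        {"status": "firing",
         "labels": {"alertname": "ExampleAlert", "service": "billing"}},
    ],
}

print(status_updates(payload))
```

The unmapped billing alert is dropped here, which is a reminder to check that every service label your rules emit has a mapping.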

What the Status Page Looks Like

Your public status page shows each service as a row with a current status and a 90-day uptime history bar. When an incident is open, the affected service shows Degraded or Down depending on the severity you've configured.

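The incident timeline is assembled automatically. The open and resolve entries come from the AlertManager webhook events, and the entries in between come from your team's activity in Alert24: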

14:32 UTC  Incident opened — API: High error rate detected
14:32 UTC  On-call engineer notified (PagerDuty escalation)
14:38 UTC  Incident acknowledged by Sarah M.
14:41 UTC  Update posted: "Investigating elevated 5xx errors on API servers"
14:55 UTC  Update posted: "Identified bad deploy, rollback in progress"
15:03 UTC  Resolved — API returned to normal operation
           Duration: 31 minutes

That timeline is visible to customers on the status page in real time. You don't have to remember to post updates — the acknowledgment and resolution events come through automatically. The manual updates ("Investigating...", "Rollback in progress") are the only thing your engineer needs to add, and they can do it from the Alert24 mobile app or the incident Slack thread.
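The figures a postmortem usually wants fall straight out of those timestamps. As a quick check of the arithmetic in Python (the calendar date is a placeholder, since the sample timeline only gives times):

```python
from datetime import datetime, timezone

# Placeholder date; only the times come from the sample timeline.
opened = datetime(2024, 1, 1, 14, 32, tzinfo=timezone.utc)
acknowledged = datetime(2024, 1, 1, 14, 38, tzinfo=timezone.utc)
resolved = datetime(2024, 1, 1, 15, 3, tzinfo=timezone.utc)

time_to_ack_minutes = int((acknowledged - opened).total_seconds() // 60)
duration_minutes = int((resolved - opened).total_seconds() // 60)

print(time_to_ack_minutes)  # 6
print(duration_minutes)     # 31
```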

Step 4: Test the Full Flow Before You Need It

Don't wait for a real outage to find out your webhook URL is wrong.

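AlertManager ships with amtool, a CLI that can send test alerts. It needs to know where AlertManager is running, either through the --alertmanager.url flag or its config file: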

amtool alert add \
  alertname=APIHighErrorRate \
  service=api \
  severity=critical \
  --annotation=summary="Test alert from amtool"

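Give AlertManager at least the 30-second group_wait to dispatch the webhook, then check Alert24 to confirm the incident opened and the status page updated. AlertManager does not log successful notifications at its default log level, so run it with --log.level=debug or watch the alertmanager_notifications_total metric if you want confirmation on that side. Then resolve the alert by re-sending it with the same labels and an end time of now, and confirm the incident closes. The resolved webhook goes out on the next group_interval flush, so allow up to 5 minutes:

amtool alert add alertname=APIHighErrorRate service=api severity=critical --end="$(date -u +%Y-%m-%dT%H:%M:%SZ)"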

This also lets you confirm that send_resolved: true is working — the incident should close and the service should return to Operational without any manual action.
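If amtool is not installed where you run tests, the same check can be scripted against AlertManager's HTTP API, which accepts a JSON array of alerts on POST /api/v2/alerts. The following Python sketch assumes AlertManager is reachable at http://localhost:9093 (its default port); build_alert and post_alerts are illustrative names, not part of any library.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

ALERTMANAGER_URL = "http://localhost:9093"  # assumption: local default port


def build_alert(resolved=False):
    """Build one test alert in the shape POST /api/v2/alerts expects."""
    alert = {
        "labels": {
            "alertname": "APIHighErrorRate",
            "service": "api",
            "severity": "critical",
        },
        "annotations": {"summary": "Test alert from script"},
    }
    if resolved:
        # An end time in the past marks the alert as resolved.
        ended = datetime.now(timezone.utc) - timedelta(seconds=1)
        alert["endsAt"] = ended.replace(microsecond=0).isoformat()
    return alert


def post_alerts(alerts, base_url=ALERTMANAGER_URL):
    """Send alerts to AlertManager and return the HTTP status code."""
    request = urllib.request.Request(
        base_url + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Usage (needs a running AlertManager, so it is left commented out):
# post_alerts([build_alert()])               # open the test incident
# post_alerts([build_alert(resolved=True)])  # then resolve it
```

Because the labels are identical in both calls, AlertManager treats the second request as an update to the same alert rather than a new one.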

What You Get Out of This

Once this is running, a few things change. Your on-call engineer still gets paged, but the status page update is no longer a separate manual task they have to remember in the middle of an incident. Customers get visibility immediately. And after the incident, you have a complete timeline that feeds directly into your postmortem — start time, acknowledgment time, resolution time, and any notes posted during the incident.

The 90-day uptime history also starts accumulating real data, which is more credible to enterprise customers than a status page that's been manually set to green for years.

Next Steps

If you have Prometheus and AlertManager running today, the integration takes about 20 minutes to set up:

  1. Add the service label to your existing alert rules.
  2. Create your services in Alert24 and grab your integration webhook URL.
  3. Add the webhook receiver to alertmanager.yml with send_resolved: true.
  4. Map your service labels in the Alert24 integration settings.
  5. Run a test alert with amtool to confirm the full flow works end to end.

The status page is public by default and requires no login for customers to view. You can customize the domain, the service names, and whether incident details are visible or summarized.

If you're not yet running Alert24, you can set up a free account at alert24.io and have the status page live before your next on-call shift.