← Back to Blog

How to Correlate Datadog Alerts with Recent Deployments

The Moment You Dread

Your phone buzzes at 2 AM. Datadog is firing on p99 latency. You open the alert, look at the graph, and see the inflection point: something changed about 40 minutes ago. Now comes the question you will spend the next 20 minutes answering — what shipped?

You open GitHub, filter commits by time, check CircleCI for recent pipelines, ping a colleague to ask if they deployed anything. By the time you figure out that yes, someone pushed a batch processing change three hours ago that finally caught up with load, you have already wasted the most valuable minutes of incident response.

This is not a tooling problem unique to small teams. It happens everywhere, and the fix is straightforward: make deployment events first-class citizens in your incident timeline.

Why Deployment Correlation Is Hard in Practice

Datadog gives you metrics, APM traces, and logs. What it does not give you automatically is a reliable record of what your team shipped in the last four hours, timestamped in a way that lines up with your alert timeline without extra configuration.

You can use Datadog's deployment tracking features — APM deployments, version tags on services — but these require instrumentation discipline across every service. If one service misses the version tag, you are back to guessing.

The more durable approach is treating deployments as structured events you log deliberately at the end of every CI/CD job, regardless of which monitoring tool fires the alert.

The Change Log Pattern

A change log is a simple event feed: each record says "this thing changed, at this time, and here is who did it." When an alert fires, your on-call engineer can immediately see which changes preceded it by minutes or hours.

Alert24 has a change log API designed exactly for this. You send one HTTP request at the end of a deploy job, and that deployment event becomes visible in the incident context whenever an alert fires afterward. No agent to install, no service instrumentation required — just a curl call in your existing pipeline.

Logging a Deployment from GitHub Actions

Here is a complete GitHub Actions job step you can drop into any workflow:

- name: Log deployment to Alert24
  if: success()
  env:
    ALERT24_API_KEY: ${{ secrets.ALERT24_API_KEY }}
  run: |
    curl -s -X POST https://api.alert24.com/v1/changes \
      -H "Authorization: Bearer $ALERT24_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "type": "deployment",
        "summary": "Deployed ${{ github.repository }} @ ${{ github.sha }}",
        "environment": "production",
        "actor": "${{ github.actor }}",
        "metadata": {
          "sha": "${{ github.sha }}",
          "ref": "${{ github.ref_name }}",
          "workflow": "${{ github.workflow }}",
          "run_id": "${{ github.run_id }}"
        }
      }'

Place this step after your deploy step. The if: success() condition means it only fires when the deploy itself succeeded — you do not want phantom deployment events for failed pipelines.

For CircleCI, the equivalent looks like this:

- run:
    name: Log deployment to Alert24
    command: |
      curl -s -X POST https://api.alert24.com/v1/changes \
        -H "Authorization: Bearer $ALERT24_API_KEY" \
        -H "Content-Type: application/json" \
        -d "{
          \"type\": \"deployment\",
          \"summary\": \"Deployed ${CIRCLE_PROJECT_REPONAME} @ ${CIRCLE_SHA1}\",
          \"environment\": \"production\",
          \"actor\": \"${CIRCLE_USERNAME}\",
          \"metadata\": {
            \"sha\": \"${CIRCLE_SHA1}\",
            \"branch\": \"${CIRCLE_BRANCH}\",
            \"build_num\": \"${CIRCLE_BUILD_NUM}\"
          }
        }"

The payload shape is flexible. The fields that matter most for incident correlation are summary (what changed, human-readable), environment (so you can filter staging vs. production events), and the timestamp, which Alert24 sets to the moment the request arrives.

What the API Accepts

Field Type Required Notes
type string yes Use "deployment" for deploys; also accepts "config_change", "feature_flag", "rollback"
summary string yes Short human-readable description shown in the timeline
environment string recommended Filters noise; use "production", "staging", etc.
actor string recommended Who triggered the change
metadata object no Arbitrary key-value pairs; commit SHA, PR number, etc.
severity string no "low", "medium", "high" — defaults to "low"

You can also log non-deploy changes — feature flag flips, database migrations, infrastructure changes — using the same endpoint with a different type. All of these will surface in the same timeline.

How It Surfaces During an Incident

When Datadog fires an alert and Alert24 creates an incident, the incident detail view includes a change log panel. This panel shows all change events that occurred in a configurable lookback window — typically the two to four hours before the alert triggered.

Your on-call engineer sees the alert and, without switching tools, sees a list like:

  • 14:32 — Deployed payment-service @ a3f92c1 (pushed by dan)
  • 13:55 — Feature flag new_checkout_flow enabled for 100% of users (pushed by maya)
  • 13:20 — Deployed frontend @ b91d04e (pushed by CI)

If the alert inflection point is 14:35, the deployment three minutes earlier is the obvious starting point. Your engineer is not guessing anymore — they are validating a hypothesis.

This is the workflow that deployment correlation is supposed to enable: alert fires, probable cause is visible in context, investigation starts with a concrete lead instead of a search.

Handling Multiple Environments and Services

If you run multiple services or environments, log each deployment separately. The API call is cheap — it is one HTTP request per deploy job — and granularity pays off during incidents. Knowing that payment-service deployed but auth-service did not rules out half your hypotheses immediately.

You can namespace your summaries or use the metadata field to group by service:

{
  "type": "deployment",
  "summary": "Deployed payment-service @ a3f92c1",
  "environment": "production",
  "metadata": {
    "service": "payment-service",
    "team": "payments",
    "sha": "a3f92c1"
  }
}

Alert24 lets you filter the change log panel by environment and by metadata fields, so during an incident you can narrow the view to just production events for the affected service.

Next Steps

Start with your highest-risk deployment pipeline — the service that causes the most incidents when something goes wrong. Add the curl step, store your API key as a CI secret, and run a test deploy.

After the first real incident where your on-call engineer spots the culprit deployment in the timeline without digging through GitHub, you will add the step to every other pipeline.

If you log changes beyond deployments — feature flags, configuration changes, database migrations — the timeline becomes even more useful. Incidents that look like pure load spikes often turn out to be query plan regressions from a schema migration that ran two hours earlier. That kind of correlation is invisible until you make the events explicit.

The Datadog alert tells you something is wrong. The change log tells you what changed before it went wrong. Those two pieces of information together are what actually gets you to a fast resolution.