Courier Status Page

Email & Communication · monitored by Alert24

All Systems Operational

Components

Web Application: Operational
API: Operational
Automations: Operational
Observability: Operational
Courier Inbox: Operational

Recent Incidents

Slack Deliverability Issues

minor

May 27, 2026 · resolved May 27

Please continue to monitor Slack's status page for updates on message deliverability: https://slack-status.com/

Courier Multi Service Outage

critical

Feb 2, 2026 · resolved Feb 2

# RFO: February 2, 2026 - Service Interruption

## Executive Summary

On February 2, 2026, a deployment introduced a configuration change that referenced a file not present in our production build artifacts, causing backend services to become unavailable. API endpoints, automations, and tracking functionality were impacted for 2 hours and 17 minutes in the US region and 2 hours and 42 minutes in the Ireland region. Recovery was prolonged by a concurrent outage at GitHub Actions, our CI/CD provider, which prevented our standard automated rollback. Our team identified the root cause, executed a manual rollback independent of the affected provider, and restored full service across all regions. We have defined targeted action items to prevent recurrence.

## Incident Overview

* **Affected services:** API endpoints (including message sending), automations, webhooks, tracking links, and authentication
* **Impact:** Requests to backend services returned errors for the duration of the incident. Users were unable to send messages, trigger automations, or access tracking data. No data was lost: requests were rejected before ingestion, so no messages were partially processed or left in an inconsistent state.
* **Detection:** Our monitoring systems flagged elevated error rates within minutes of the issue beginning.
* **Contributing factor:** GitHub Actions, our CI/CD provider, experienced a complete outage from 10:35 to 16:30 PST. This overlapped with our incident window and prevented our standard automated rollback from executing, extending the time to resolution.

## Timeline of Events

All times in PST.

| **Time** | **Event** |
| --- | --- |
| 10:59 | Deployment of latest release initiated through standard CI/CD pipeline |
| 11:44 | Deployment completed and went live; services immediately began experiencing errors due to a missing configuration dependency |
| 11:49 | GitHub Actions, our CI/CD provider, experienced a complete outage, preventing standard rollback procedures |
| 11:52 | Monitoring alerts triggered; engineering team engaged |
| 12:00 | Incident declared; rollback initiated; engineering team assembled |
| 12:07 | Status page updated: issue identified and rollback in progress. The rollback was subsequently blocked by the GitHub Actions outage. |
| 12:39 | Team pivoted to an alternative manual rollback approach, which first had to be tested and validated |
| 13:50 | Team executed the alternative manual rollback, independent of GitHub Actions, once fully satisfied that the new approach was safe |
| 14:01 | Manual rollback completed in US region; services confirmed operational |
| 14:26 | Ireland region deployment completed; services confirmed operational |
| 15:13 | All services verified stable across all regions; incident resolved |

## Root Cause Analysis

The disruption was traced to a configuration change included in the latest release. The change introduced a startup dependency on a utility file that was intended to be bundled with the deployment package. However, the file was not included in the production build artifacts. When backend services attempted to initialize, they were unable to locate the required file and could not start, resulting in all incoming requests being rejected.

This discrepancy was not caught prior to production deployment because the file was present and functioning correctly in the development environment. Build artifacts are assembled differently in the development and production environments, so the issue only manifested in production.

## Mitigation and Resolution

1. Upon identifying the root cause, the team immediately initiated a rollback to the prior known-good release through our standard CI/CD pipeline.
2. A complete outage at GitHub Actions, our CI/CD provider, prevented the automated rollback from completing. The team identified this external dependency and pivoted to an alternative approach.
3. The team executed a manual rollback by retrieving prior deployment artifacts from our backup storage and deploying them directly, bypassing the affected CI/CD pipeline entirely.
4. Services were restored region by region, with the US region confirmed operational at 14:01 PST and the Ireland region at 14:26 PST.
5. Extended monitoring was conducted across all services. The incident was declared fully resolved at 15:13 PST, once we were satisfied that everything had been stable for over 45 minutes.

## Action Items

| **#** | **Action Item** | **Owner** | **Priority** | **Status** |
| --- | --- | --- | --- | --- |
| 1 | Require successful staging deployment and smoke tests before any production deployment | Engineering | P1 | In Progress |
| 2 | Improve the reliability of automated smoke tests | Engineering | P1 | In Progress |
| 3 | Add build-time validation to confirm all referenced startup dependencies are present in deployment packages | Engineering | P2 | Open |
| 4 | Adopt an expedited rollback process as the standard emergency procedure, independent of CI/CD provider availability, reducing recovery time by approximately 35 minutes | Engineering | P2 | In Progress |
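
The build-time validation described in action item 3 could take roughly the following shape. This is a minimal Python sketch under stated assumptions, not Courier's actual tooling: the function names, the idea of passing an explicit list of required relative paths, and any directory or file names used with it are hypothetical and do not come from the report.

```python
from pathlib import Path


def find_missing_dependencies(artifact_dir, required_paths):
    """Return the required relative paths that are not regular files under artifact_dir."""
    root = Path(artifact_dir)
    return [rel for rel in required_paths if not (root / rel).is_file()]


def validate_artifact(artifact_dir, required_paths):
    """Raise FileNotFoundError if any startup dependency is missing from the artifact.

    Intended to run as a build step (hypothetical), so that a missing file
    fails the build instead of failing service startup in production.
    """
    missing = find_missing_dependencies(artifact_dir, required_paths)
    if missing:
        raise FileNotFoundError(
            "Artifact is missing startup dependencies: " + ", ".join(missing)
        )
```

A pipeline step could call `validate_artifact` on the assembled production package, with the list of files the services load at startup, and stop the release if it raises. Because the check runs against the production artifact rather than the development tree, it targets the kind of environment difference named in the root cause analysis.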

Observability Degradation

minor

Jan 29, 2026 · resolved Jan 29

The Courier team identified an issue affecting observability metrics between 21:38 UTC and 22:13 UTC, during which metrics were unavailable. A fix has been released and metrics have stabilized across observability channels. Metrics received during the outage window will not appear in observability dashboards.

Courier Web App Performance

minor

Jan 14, 2026 · resolved Jan 15

A fix has been released and the application is stable.

All services impacted

critical

Oct 20, 2025 · resolved Oct 20

The incident has been resolved
