xMatters Status Page

IT Management & MSP · monitored by Alert24

Current Status

All Systems Operational

Components

Web Interface: Operational
Email Notifications: Operational
SMS Notifications: Operational
Voice Notifications: Operational
Conferencing: Operational
Integration Platform: Operational
API: Operational

Recent Incidents

Issue Discovered - Service disruption in Europe Region – Web User Interface

major

May 1, 2026 · resolved May 1

**What happened?** On May 1, 2026, some customers reported to xMatters Customer Support that attempting to send a message via the web user interface, or to view an alert on the Alerts report, displayed an error. The issue affected only the Reporting functions in the EMEA region; the system continued to accept signals, generate alerts, and send notifications across all regions.

**Why did it happen?** During routine database maintenance, the database called a mismatched library version, which caused an internal database error. The version mismatch within the cluster was traced to a prior database engine upgrade in which a subset of replica nodes did not restart into the upgraded version. At no point was there any risk to data integrity.

**How did we respond?** As soon as Customer Support confirmed the issue, they engaged the Engineering teams, who identified the root cause and restored version consistency across all nodes. The teams then validated stability and confirmed that all services were restored.

**What are we doing to prevent it from happening again?** The Engineering team has added explicit post-upgrade verification checks and monitoring to confirm that all nodes run the same version after an upgrade and remain aligned, which should prevent this issue from recurring.
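A post-upgrade verification check of the kind described above could, in its simplest form, compare the version each node reports against the expected one. The sketch below is illustrative only: the node names, version strings, and the `find_version_mismatches` helper are assumptions, not details from the xMatters report.

```python
def find_version_mismatches(node_versions, expected):
    """Return the nodes whose reported version differs from the expected one.

    node_versions maps a (hypothetical) node name to the version it reports.
    """
    return {
        node: version
        for node, version in node_versions.items()
        if version != expected
    }


# Hypothetical inventory collected after an engine upgrade.
reported = {
    "primary": "15.4",
    "replica-1": "15.4",
    "replica-2": "15.3",  # a replica that did not restart into the new version
}

mismatched = find_version_mismatches(reported, expected="15.4")
if mismatched:
    print("Version mismatch on:", ", ".join(sorted(mismatched)))
```

Running a check like this after every upgrade, and again on a schedule, would surface a node that failed to restart into the new version before maintenance work touches it.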

Issue Discovered - Service disruption in North American Region – API

minor

Apr 1, 2026 · resolved Apr 1

**What happened?** On April 1, 2026, some customers reported to xMatters Customer Support that the Alerts or Notifications reports were timing out and failing to load.

**Why did it happen?** A backend service was experiencing significant unexpected load, which delayed report processing. The ongoing resource constraint resulted in timeouts.

**How did we respond?** As soon as customers reported the issue, Customer Support launched an investigation and escalated to the Engineering teams. The teams initiated rolling restart procedures for the applicable backend services. Once the rolling restart was completed, service was fully restored.

**What are we doing to prevent it from happening again?** The Engineering teams are working to identify any bottlenecks that may have caused the performance issues. In the interim, they have increased and adjusted resource allocation for several services to absorb processing delays and reduce the risk of recurrence.

Issue Discovered - Service disruption in All Regions – SSO login to Web User Interface

minor

Mar 10, 2026 · resolved Mar 10

**What happened?** On March 10, 2026, some customers reported to xMatters Customer Support that they were encountering a 404 error page when attempting to log in to their instances via SSO. Some users may also have encountered a “We’ve run into a problem while retrieving your data.” error message.

**Why did it happen?** This issue occurred during a routine maintenance update to the xMatters platform. Several components of the platform were updated, and specific configurations of the component that handles SSO-based authentication conflicted with the update, resulting in 404 errors. The issue was limited to the few customers with specific criteria set in their SSO configuration.

**How did we respond?** As soon as customers reported the issue, Customer Support verified it and escalated immediately to Engineering. The team traced the issue to the maintenance deployment and initiated rollback procedures. Once the rollback was completed, the 404 errors ceased and service was confirmed restored.

**What are we doing to prevent it from happening again?** The Engineering teams have implemented additional testing for configuration criteria related to the SSO-based authentication component. They have also begun working on an improved maintenance plan to prevent issues during similar deployments.

Issue Discovered - Service disruption in North American Region – Multiple Services

minor

Nov 21, 2025 · resolved Nov 21

**What happened?** On November 21, 2025, at 12:50 PM UTC, the xMatters internal monitoring tools detected irregular behavior in how internal traffic was being routed. Some customers in the APAC region communicating with services in North America (specifically US-East) may have encountered intermittent request failures or increased latency. Only traffic between these two regions was affected; all other systems and regions continued normal operations.

**Why did it happen?** A temporary network disruption between Australia Southeast and US-East caused one internal routing node in Australia to lose accurate information about available backend systems in US-East. The node generated an incomplete routing configuration and temporarily stopped directing traffic to US-East. Under normal circumstances, routing updates refresh automatically when connectivity returns. In this case, the affected node did not recover cleanly and remained in a stale state until Engineering intervened.

**How did we respond?** As soon as Engineering was alerted through internal monitoring, they engaged the platform engineering team, service owners, and Customer Support to launch an investigation. The teams reached out to impacted customers to validate the symptoms and restarted routing components in both affected regions to force a configuration refresh. Once the restart completed, routing returned to normal while the teams continued to monitor and investigate the root cause. They confirmed that only one routing node and specific cross-region traffic were impacted.

**What are we doing to prevent it from happening again?** While mitigating this issue, the teams created new alerting rules to detect the routing patterns observed during the incident and expanded internal monitoring to help identify when routing nodes fail to refresh their configuration or otherwise enter a ‘stale’ state.

The teams have also planned and prepared infrastructure updates that will further reduce the risk of similar issues. These include improved configuration recovery behavior, enhanced stability for routing components, and additional logging and observability improvements for diagnosing routing anomalies. They will deploy these updates once the current code freeze window has elapsed.
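As a rough illustration of the stale-node monitoring described above, a check might flag a routing node when its configuration has not refreshed recently or when it lists no backends for a region it is expected to reach. Everything in this sketch is an assumption made for illustration: the `RoutingNodeState` fields, the region names, and the 60-second threshold are not taken from the xMatters report.

```python
from dataclasses import dataclass


@dataclass
class RoutingNodeState:
    """Hypothetical snapshot of what a routing node currently knows."""
    name: str
    last_refresh: float        # time of last config refresh, in epoch seconds
    backends_by_region: dict   # region name -> number of known backends


def is_stale(node, now, max_age_seconds, required_regions):
    """Flag a node whose config is too old or is missing a required region."""
    too_old = (now - node.last_refresh) > max_age_seconds
    missing_region = any(
        node.backends_by_region.get(region, 0) == 0
        for region in required_regions
    )
    return too_old or missing_region


now = 1_000.0
nodes = [
    RoutingNodeState("node-a", last_refresh=990.0, backends_by_region={"us-east": 3}),
    RoutingNodeState("node-b", last_refresh=995.0, backends_by_region={"us-east": 0}),
    RoutingNodeState("node-c", last_refresh=100.0, backends_by_region={"us-east": 3}),
]

stale = [n.name for n in nodes if is_stale(n, now, 60.0, ["us-east"])]
print("Stale routing nodes:", stale)
```

Checking for missing regions as well as age matters here because a node can refresh on schedule and still hold an incomplete configuration.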

Issue Discovered - Service disruption in All Regions – Conferencing

major

Oct 20, 2025 · resolved Oct 21

**What happened?** On October 20, 2025, at approximately 1:21 AM Pacific, customers began reporting an issue affecting live call routing, conferences, and voice notifications. Customers in all regions would have been affected.

**Why did it happen?** This issue was caused by a global AWS outage that impacted one of our downstream providers, causing the voice notifications, live call routing, and conferencing handled by that provider to fail.

**How did we respond?** As soon as the first customer reported an issue, Customer Support engaged the Engineering teams and launched an investigation. Once they identified the root cause, the teams switched the primary provider for affected regions to an alternate provider that was not affected by the AWS outage. After the new provider was assigned, customers reported that all services were operating correctly.

**What are we doing to prevent it from happening again?** Although this issue was outside our control and that of our provider, we are working to identify ways to improve resilience against external factors such as this.
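The provider switch described above was performed manually by the teams. One way such a switch could be automated is to walk an ordered list of providers and pick the first one that passes a health check. The sketch below assumes hypothetical provider names and a precomputed set of healthy providers; it does not reflect how xMatters actually selects providers.

```python
from typing import Optional


def select_provider(preferred_order, healthy) -> Optional[str]:
    """Return the first provider in preference order that is healthy.

    Returns None when no provider is currently healthy.
    """
    for provider in preferred_order:
        if provider in healthy:
            return provider
    return None


# Hypothetical provider names, listed from most to least preferred.
providers = ["voice-primary", "voice-alternate"]

# During the outage, only the alternate provider passes its health check.
active = select_provider(providers, healthy={"voice-alternate"})
print("Routing voice traffic through:", active)
```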
