PubNub Status Page

Telephony & CPaaS · monitored by Alert24

Current Status

All Systems Operational

Components

| Component | Status |
| --- | --- |
| Publish/Subscribe Service | Operational |
| Website | Operational |
| Functions Service | Operational |
| North America Points of Presence | Operational |
| Storage and Playback Service | Operational |
| Administration Portal | Operational |
| European Points of Presence | Operational |
| Vault | Operational |
| Stream Controller Service | Operational |
| SDK Documentation | Operational |
| Asia Pacific Points of Presence | Operational |
| Key Value store | Operational |
| Presence Service | Operational |
| PubNub Support Portal | Operational |
| Southern Asia Points of Presence | Operational |
| Scheduler Service | Operational |
| Access Manager Service | Operational |
| DNS Service | Operational |
| Mobile Push Gateway | Operational |
| App Context Service | Operational |

Recent Incidents

Connectivity Issues Affecting a Subset of Subscriptions

none

Mar 24, 2026 · resolved Mar 24

### **Problem Description, Impact, and Resolution**

On March 24, 2026, at 19:27 UTC, one network shard experienced intermittent connectivity affecting a subset of customers. The affected users may have experienced elevated latency and temporary error responses related to their subscription requests.

The instability was caused by an atypical surge in message volume within a shared processing environment that had improperly configured resource limits. This led to high resource utilization and triggered automated system restarts.

PubNub Engineering resolved the issue by implementing the proper limits after expanding infrastructure capacity to accommodate the increased load. Service was fully stabilized once the environment was tuned to the new traffic profile.

### **Mitigation Steps and Recommended Future Preventative Measures**

- **Infrastructure Tuning:** Adjusted automated scaling parameters to provide greater headroom for rapid traffic fluctuations.
- **Enhanced Traffic Management:** Deployed refined monitoring heuristics to better isolate and manage high-volume traffic patterns without impacting shared resources.
- **Dynamic Resource Allocation:** Accelerating the rollout of enhanced vertical scaling technology to allow individual processing nodes to adapt more fluidly to demand spikes.
- **Operational Coordination:** Strengthening internal protocols for high-capacity events to ensure large-scale traffic shifts are proactively transitioned to dedicated environments.

Delay in Publishing Messages to Storage Globally

minor

Jan 1, 2026 · resolved Jan 1

### **Problem Description, Impact, and Resolution**

On January 1, 2026 at 00:00 UTC, we observed elevated latency in our History service across multiple regions. Customers may have experienced delays in message persistence and history availability during this period.

The issue was caused by a mismatch in newly created persistence tables. Specifically, required columns for message metadata were missing from the new tables, resulting in failed write operations and backed-up queues. This created downstream pressure on our storage systems, leading to higher latency in history processing.

We mitigated the issue by manually applying the correct updates across all affected persistence spaces. After the updates were applied, message processing returned to normal and queue latency cleared.

This issue occurred because we did not have proper controls in place to ensure schema consistency for newly generated monthly persistence tables.

### **Mitigation Steps and Recommended Future Preventative Measures**

To resolve the issue, we manually applied the required schema updates globally. In the coming days, we will update our change management processes to ensure schema changes are correctly applied to all future monthly tables. We are also auditing our schema tracking and automating validation to prevent inconsistencies across environments.

These improvements will ensure that future table generation includes all necessary columns and reduce the risk of similar issues impacting History service performance.
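The kind of automated schema validation mentioned above could look roughly like the sketch below. It is illustrative only and is not PubNub's implementation: the column names, the table names, and the helper functions `missing_columns` and `validate_tables` are all hypothetical, since the incident report does not describe the real schema or tooling. The idea is simply to compare each newly generated monthly table against a required column set before it starts receiving writes.

```python
from typing import Dict, Set

# Hypothetical required columns; the incident report does not name the real ones.
REQUIRED_COLUMNS: Set[str] = {"channel", "timetoken", "payload", "meta"}


def missing_columns(table_columns: Set[str],
                    required: Set[str] = REQUIRED_COLUMNS) -> Set[str]:
    """Return the required columns that a table lacks."""
    return required - table_columns


def validate_tables(tables: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each non-conforming table name to the columns it is missing."""
    problems: Dict[str, Set[str]] = {}
    for name, columns in tables.items():
        gap = missing_columns(columns)
        if gap:
            problems[name] = gap
    return problems


if __name__ == "__main__":
    # Hypothetical monthly tables: the newer one lacks the metadata column.
    tables = {
        "history_2025_12": {"channel", "timetoken", "payload", "meta"},
        "history_2026_01": {"channel", "timetoken", "payload"},
    }
    for name, gap in validate_tables(tables).items():
        print(f"{name} is missing columns: {sorted(gap)}")
```

A check like this would typically run as part of table generation or as a scheduled audit, so that a missing column is caught before write traffic reaches the new table.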

Increased errors observed and resolved

minor

Nov 20, 2025 · resolved Nov 20

**Problem Description, Impact, and Resolution**

Starting at 16:46 UTC on Nov. 20, 2025, we noticed a small number of errors with the publish API in the North American and Asia Pacific regions. The system automatically recovered with all functionality fully restored by 16:50 UTC on Nov. 20, 2025.

Increased latency and errors observed in US-West

minor

Oct 20, 2025 · resolved Oct 20

### **Problem Description, Impact, and Resolution**

On October 20th, 2025 at 07:06 UTC, our monitoring systems alerted us to elevated error levels across multiple PubNub services in the IAD region (US-East). Some customers may have experienced increased error rates and latency, as well as intermittent issues with Presence service availability across IAD (US-East), SJC (US-West), and HND (AP-Northeast).

We quickly determined the issue was caused by a broader infrastructure outage affecting our cloud provider (AWS) in the IAD region. We initiated regional failover procedures and re-routed new connections to alternate regions. However, due to undefined steps in some of our failover processes and delays accessing some tools due to the provider issue, existing connections for some services remained degraded for longer than expected.

To restore full service, we manually reset established connections, re-routed Presence traffic to Frankfurt (EU-Central), and brought on additional infrastructure in other regions to absorb traffic. Errors were mitigated by 09:20 UTC.

Later in the day, additional regional load in US-West triggered a new wave of service degradation. We responded by isolating the US-East region again and scaling up balancer capacity in US-West. PubNub services were stabilized by 13:20 UTC, and remained in a monitoring state while our infrastructure provider worked to fully resolve the underlying issue.

By 22:35 UTC, our provider reported full restoration of service. After validating stability in US-East, we completed rebalancing traffic by 23:48 UTC, and declared the incident resolved.

### **Mitigation Steps and Recommended Future Preventative Measures**

While this incident was caused by an external infrastructure outage, we’ve identified several opportunities to strengthen our internal readiness and response procedures.

We are consolidating and centralizing our regional failover procedures to ensure they are immediately accessible and complete for all production services. Any gaps in our process documentation for newer services will be addressed to ensure readiness before they are fully adopted into production. Additionally, we are reviewing and resolving issues with internal tooling, including inventory and DNS resolution problems, which made mitigation more difficult during the incident.

These improvements will ensure faster and more consistent responses to future infrastructure-level disruptions, and reduce potential impact on customer traffic across regions.

Elevated latencies and errors for multiple services in US-West and US-East

minor

Oct 20, 2025 · resolved Oct 20

### **Problem Description, Impact, and Resolution**

On October 20th, 2025 at 07:06 UTC, our monitoring systems alerted us to elevated error levels across multiple PubNub services in the IAD region (US-East). Some customers may have experienced increased error rates and latency, as well as intermittent issues with Presence service availability across IAD (US-East), SJC (US-West), and HND (AP-Northeast).

We quickly determined the issue was caused by a broader infrastructure outage affecting our cloud provider (AWS) in the IAD region. We initiated regional failover procedures and re-routed new connections to alternate regions. However, due to undefined steps in some of our failover processes and delays accessing some tools due to the provider issue, existing connections for some services remained degraded for longer than expected.

To restore full service, we manually reset established connections, re-routed Presence traffic to Frankfurt (EU-Central), and brought on additional infrastructure in other regions to absorb traffic. Errors were mitigated by 09:20 UTC.

Later in the day, additional regional load in US-West triggered a new wave of service degradation. We responded by isolating the US-East region again and scaling up balancer capacity in US-West. PubNub services were stabilized by 13:20 UTC, and remained in a monitoring state while our infrastructure provider worked to fully resolve the underlying issue.

By 22:35 UTC, our provider reported full restoration of service. After validating stability in US-East, we completed rebalancing traffic by 23:48 UTC, and declared the incident resolved.

### **Mitigation Steps and Recommended Future Preventative Measures**

While this incident was caused by an external infrastructure outage, we’ve identified several opportunities to strengthen our internal readiness and response procedures.

We are consolidating and centralizing our regional failover procedures to ensure they are immediately accessible and complete for all production services. Any gaps in our process documentation for newer services will be addressed to ensure readiness before they are fully adopted into production. Additionally, we are reviewing and resolving issues with internal tooling, including inventory and DNS resolution problems, which made mitigation more difficult during the incident.

These improvements will ensure faster and more consistent responses to future infrastructure-level disruptions, and reduce potential impact on customer traffic across regions.
