scalr.io Status Page

Developer Platforms & Tools · monitored by Alert24

Current Status

All Systems Operational

Components

GitHub Webhooks: Operational
Stripe API: Operational
Google Cloud Platform Google Cloud Networking: Operational
Scalr Platform: Operational
GitHub API Requests: Operational
Stripe Checkout.js: Operational
Google Cloud Platform Google Cloud DNS: Operational
Scalr Worker: Operational
GitHub Pull Requests: Operational
Stripe JS: Operational
Google Cloud Platform Google Cloud SQL: Operational
Google Cloud Platform Google Cloud Storage: Operational
Provider Registry: Operational
Google Cloud Platform Google Compute Engine: Operational
Scalr Documentation: Operational
Google Cloud Platform Google Kubernetes Engine: Operational

Recent Incidents

Scalr Run Queue Delayed

none

May 21, 2026 · resolved May 21

This incident has been resolved.

Errors Relating to SSL

major

May 19, 2026 · resolved May 19

This incident has been resolved.

Elevated Run Errors

minor

May 18, 2026 · resolved May 18

This incident has been resolved.

Run Delays

minor

May 6, 2026 · resolved May 6

This incident has been resolved.

Workspace runs stuck in pending approval

minor

Apr 29, 2026 · resolved Apr 29

**Final Update - 5/4/2026:** The root cause was that a burst of new runs coincided with a jump in VCS and other I/O-bound tasks, which increased overall task duration across the queue. CPU utilization remained moderate throughout, so the CPU-based worker autoscaling did not trigger, meaning the system did not add capacity even as the queue fell behind. This created a cascading effect: as task delays continued to grow, a race condition was exposed, causing run transition tasks to fail. The queue overload alone would have resulted in only a temporary slowdown, but the race condition caused some waiting runs to become permanently stuck with no automatic recovery.

What has been done so far:

* The improvement released on April 30 made run transitions significantly more stable and resilient to race conditions.
* Audit log processing, which accounted for roughly a third of all task load at the time of the incident, has been moved to dedicated workers.
* The I/O-bound workers have more resources.

Monitoring since these changes shows noticeably healthier queue behavior - fewer tasks overall and faster execution times.

What will be done next:

* Moving run transitions to a separate queue and fully removing the underlying race condition, so that even under heavy load runs cannot get stuck.
* Optimizing run notification tasks that spiked during the incident.

**Update #2 - 5/1/2026** Further investigation points to what triggered the queue slowdowns. Issues with runs getting stuck on Apr 28/29 lined up with high bursts of new runs being created at the same time as several heavier background jobs. Those jobs share the same pool of workers, and when they spiked together, the pool fell behind, which delayed the steps that move runs from one stage to the next. The resilience improvement we shipped on April 30 already addresses that delay. We are now planning changes to separate the heaviest background work from run-transition processing, so a busy moment in one area cannot slow the other. We will share another update as that work progresses.

**Update #1 - 4/30/2026** On April 28 at approximately 15:00 UTC and again on April 29 at approximately 15:22 UTC, a load spike affected our internal task queue, which slowed down the mechanism Scalr uses to transition runs between pipeline stages. In some cases, the transition task failed before it could complete, leaving runs stuck in a waiting state with no automatic recovery path. Both times, the queues cleared on their own while our engineering team investigated. We have already shipped an improvement (released April 30) that makes this transition task significantly more resilient to lock contention, greatly reducing the likelihood of runs getting stuck. We are continuing to investigate the underlying cause of the queue spikes and will share further updates as the investigation progresses.
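
The autoscaling gap described in the final update (an I/O-bound backlog growing while CPU stays moderate) can be illustrated with a minimal Python sketch. This is not Scalr's autoscaler, and the updates above do not say a queue-based signal will be adopted. The `PoolMetrics` type, both function names, and every threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolMetrics:
    """Hypothetical snapshot of a worker pool (not a real Scalr structure)."""
    cpu_utilization: float     # 0.0-1.0, averaged across workers
    oldest_task_wait_s: float  # age of the oldest queued task, in seconds


def cpu_only_should_scale(m: PoolMetrics, cpu_threshold: float = 0.75) -> bool:
    # Scales only on CPU pressure; blind to tasks that wait on I/O.
    return m.cpu_utilization >= cpu_threshold


def cpu_or_backlog_should_scale(
    m: PoolMetrics,
    cpu_threshold: float = 0.75,  # assumed value
    max_wait_s: float = 60.0,     # assumed value
) -> bool:
    # Also scales when queued tasks have been waiting too long,
    # regardless of how busy the CPUs are.
    return (
        m.cpu_utilization >= cpu_threshold
        or m.oldest_task_wait_s >= max_wait_s
    )


# An I/O-bound backlog: moderate CPU, but tasks waiting five minutes.
io_bound_backlog = PoolMetrics(cpu_utilization=0.40, oldest_task_wait_s=300.0)

print(cpu_only_should_scale(io_bound_backlog))
print(cpu_or_backlog_should_scale(io_bound_backlog))
```

Under these assumed numbers, the CPU-only rule leaves the pool at its current size while the queue falls behind, which matches the behavior the postmortem reports. A rule that also watches task wait time would request more capacity.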
