Testsigma Status Status Page

Testing & Quality · monitored by Alert24

All Systems Operational

Current Status

All Systems Operational

Components

Testsigma Web App
Operational
Testsigma Mobile Inspector
Operational
Testsigma Execution Engine
Operational
Testsigma Visual Testing
Operational

Recent Incidents

Some Customers Experienced Testsigma Terminal Connection Issues Due to Configuration Problem

minor

May 22, 2026 · resolved May 22

### **Incident Summary**

On **May 22nd**, **some customers** faced issues with their local **Testsigma Terminal** connections. The incident was caused by a configuration problem that affected the terminal’s ability to establish or maintain connections correctly for a limited set of customers.

The issue was detected at around **4:30 AM UTC**, and the engineering team immediately started investigating the problem. After identifying the configuration gap, the team applied the necessary fix and validated the terminal connectivity. The issue was fully resolved by approximately **6:00 AM UTC**.

### **Impact**

Only **some customers** using the local Testsigma Terminal experienced connection issues during the incident window. The issue did not impact all customers.

**Incident window:** **May 22nd, 4:30 AM UTC to 6:00 AM UTC**

**Affected component:** Testsigma Terminal connections

**Customer impact:** A limited set of customers were unable to use terminal-based connectivity features during the affected period.

### **Root Cause**

The incident was caused by a configuration problem that had previously gone unnoticed. This configuration issue impacted the ability of **some local Testsigma Terminal instances** to connect successfully.

### **Resolution**

The team identified the configuration issue and applied the required fix. After the fix was implemented, the team validated the terminal connection flow and confirmed that the affected functionality was restored for the impacted customers.

Customers who were impacted may be required to update their local Testsigma Terminal to ensure they are using the latest corrected version.

### **Preventive Actions**

To reduce the likelihood of similar incidents in the future, we are taking the following actions:

1. Automating the management and validation of configuration-related changes.
2. Improving internal checks to detect configuration issues earlier.
3. Updating the local Testsigma Terminal to better handle configuration changes on customer machines.
4. Strengthening release validation around terminal connectivity scenarios.

### **Current Status**

The issue has been resolved, and terminal connectivity has been restored for the impacted customers. We will continue improving our configuration management and terminal handling mechanisms to prevent similar incidents in the future.

Application Availability Issue – US Region

none

Mar 19, 2026 · resolved Mar 19

We experienced an issue affecting application availability in the US region, where some users may have faced difficulty accessing the platform or experienced degraded performance. Initial investigation indicated that the issue was caused by a hardware-related problem at the server level. Due to this, the auto-scaling mechanism took some time to respond and provision additional capacity, which led to a temporary delay in service recovery. All services are now fully operational, and system performance has returned to expected levels. We are actively monitoring our systems to ensure everything remains stable and fully operational.

Application Not Accessible - EU Region

critical

Dec 16, 2025 · resolved Dec 16

# **Root Cause Analysis (RCA) – Database Service Disruption**

## **Incident Summary**

On **16th December, 2025**, the database service experienced an unexpected outage caused by **heavy load generated from an unintended, resource-intensive operation**. The surge in load exceeded the database’s handling capacity, leading to service instability and eventual crash.

## **Impact**

* **Affected Services:** Database-dependent application functionalities
* **Customer Impact:** Intermittent application unavailability and degraded performance
* **Regions Affected:** EU Region
* **Duration:** 10:28 UTC to 16:37 UTC

## **Root Cause**

The root cause of the incident was an **unintended execution of a heavy, non-optimized database operation**, which generated an abnormal spike in load. This operation was not adequately throttled or isolated and therefore consumed excessive database resources, leading to system exhaustion and crash.

## **Contributing Factors**

* Lack of safeguards to prevent or limit execution of high-cost database operations
* Insufficient load-aware throttling and query governance for resource-intensive tasks
* Limited early warning signals for sudden abnormal load patterns

## **Resolution**

* The offending operation was **immediately stopped**
* Database services were **restarted and stabilized**
* System performance was closely monitored to ensure full recovery
* No data loss was observed

## **Preventive & Corrective Actions**

To prevent recurrence and improve system resilience, the following actions will be implemented:

1. **Operational Safeguards**
   * Introduce strict controls and validation for high-load database operations
   * Enforce execution limits and safety checks for resource-intensive tasks
2. **Load Management & Scalability**
   * Implement throttling, rate-limiting, and workload isolation mechanisms
   * Improve query optimization and execution governance
3. **Monitoring & Alerting**
   * Enhance monitoring to detect abnormal load patterns earlier
   * Add proactive alerts for sudden spikes in database resource usage
4. **Architecture Improvements**
   * Design heavily loaded operations to execute asynchronously and in a scalable manner
   * Ensure graceful degradation instead of service disruption under extreme load

## **Current Status**

* **Incident Status:** Resolved
* **System Health:** Stable and operating normally
* **Ongoing Monitoring:** Active
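The report names throttling and rate-limiting as planned safeguards but does not describe how they will be built. As a rough illustration only, and not Testsigma's actual implementation, a token-bucket limiter is one common way to cap how often high-cost operations may start. The class name `TokenBucket` and its parameters (`rate_per_sec`, `capacity`, `clock`) are hypothetical and chosen for this sketch.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter for gating costly database operations."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable clock, useful for testing
        self.tokens = capacity        # start with a full bucket
        self.last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if available, else return False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False
```

A caller would check `try_acquire()` before starting a resource-intensive job and defer or reject the job when it returns `False`. Assigning a higher `cost` to heavier operations is one way to make the limit load-aware.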

Application Not Accessible - EU Region

critical

Dec 11, 2025 · resolved Dec 11

We have resolved the database server incident that impacted customers in the Europe region. Services have been restored, and we are closely monitoring the system for any further issues.

Degraded Performance

none

Sep 25, 2025 · resolved Sep 29

# **Root Cause Analysis (RCA) – Database Degraded Performance**

**Incident Period:** Over the last one week

**Impact:** Brief degraded performance for a few seconds during high-load periods

**Current Status:** Stable – no degradation observed in the last 72 hours

## **1. Summary of Issue**

During the past week, the database experienced short intervals of degraded performance. The degradation was caused by a combination of application-level inefficiencies and sudden load spikes from customer activity, which collectively pushed the database beyond its scaling thresholds.

## **2. Root Causes**

1. **Auditing Microservice Inefficiency**
   * The auditing service was generating excessive database queries.
   * Lack of proper batching and query optimization caused unnecessary DB load.
2. **Archival Service Behavior**
   * The archival microservice attempted to process and archive very large volumes of historical test run data at once.
   * This created long-running transactions and high I/O consumption.
3. **Traffic Spike from Customers**
   * A few customers generated unusually heavy workloads in a short span of time.
   * This compounded the load already caused by the auditing and archival services.
4. **Autoscaling Delay**
   * The database autoscaling mechanism did not get sufficient time to react to the sudden spike.
   * This led to short bursts of unhandled load before stabilization.

## **3. Contributing Factors**

* Lack of proper throttling in archival tasks.
* Insufficient safeguards in the auditing logic.
* Sudden surge in concurrent customer traffic coinciding with heavy background jobs.

## **4. Corrective Actions Taken**

1. **Code Optimizations**
   * Auditing service queries were optimized to reduce redundant load.
   * Archival microservice was redesigned to archive data in smaller, controlled batches.
2. **Load Distribution**
   * Some high-volume customers were migrated to new infrastructure with isolated database clusters.
   * This reduces the risk of one customer’s workload impacting others.
3. **Proactive Monitoring Enhancements**
   * New monitoring dashboards and alerts were introduced to catch early signs of DB stress.
   * Query performance metrics and background job load are now tracked in real time.

## **5. Current Status**

* No further performance degradations observed in the last **72 hours**.
* Database performance metrics remain within normal operating thresholds.
* Teams are continuously monitoring to ensure long-term stability.

## **6. Preventive Measures & Next Steps**

* Implement **query rate limiting** for background jobs (archival, auditing).
* Introduce **staggered scheduling** of large archival tasks.
* Enhance **autoscaling policy** to better handle sudden spikes.
* Conduct a **load test simulation** periodically to validate resilience.

✅ **Conclusion:** The degraded performance was the result of combined factors: inefficient auditing, aggressive archival processing, and sudden traffic spikes. With the applied optimizations, customer migration, and improved monitoring, the system has stabilized, and no further degradation has been observed. Continuous monitoring and preventive measures will ensure long-term reliability.
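The report says the archival microservice was redesigned to archive data in smaller, controlled batches, but gives no implementation details. The following is a minimal sketch of that general idea, assuming each batch is handed to a caller-supplied function that runs one short transaction. The names `chunked`, `archive_in_batches`, `archive_batch`, `batch_size`, and `pause_sec` are hypothetical and do not come from the report.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunked(ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive lists of at most `size` items from `ids`."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[int] = []
    for item in ids:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def archive_in_batches(
    run_ids: Iterable[int],
    archive_batch: Callable[[List[int]], None],
    batch_size: int = 500,
    pause_sec: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Archive `run_ids` batch by batch, pausing after each batch; return the total count."""
    archived = 0
    for batch in chunked(run_ids, batch_size):
        archive_batch(batch)      # assumed to commit one short transaction
        archived += len(batch)
        if pause_sec > 0:
            sleep(pause_sec)      # give the database room between batches
    return archived
```

Bounding each transaction to `batch_size` rows avoids the long-running transactions named in the root causes, and the pause between batches is a simple form of the throttling listed under contributing factors.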
