The Things Industries Status Page

IoT & Hardware · monitored by Alert24

Current Status

All Systems Operational

Components

Europe 1 (eu1): Operational
Packet Broker Integration: Operational
Europe 2 (eu2): Operational
Cellular data for The Things Industries gateways: Operational
North America 1 (nam1): Operational
Australia 1 (au1): Operational
Asia 1 (as1): Operational
Join Server: Operational
Gateway Controller: Operational

Recent Incidents

Issue with AWS-IOT integration in The Things Stack Cloud NAM1 Cluster

minor

May 6, 2026 · resolved May 6

This incident has been resolved.

Gateways disconnected from The Things Stack Cloud in the nam1 cluster

major

Mar 23, 2026 · resolved Mar 23

## Summary

On March 23, 2026, during a scheduled maintenance window, the Gateway Server (GS) component in the NAM1 region was accidentally restarted, causing gateways to disconnect in a similar pattern to the March 3 incident. The development team took the opportunity to roll out a fix that had been planned for the next maintenance window. The fix resolved the reconnection issue and gateways are now reconnecting within several minutes.

## Impact

Some gateways got disconnected for some of the tenants in the NAM1 region following an accidental Gateway Server restart during a maintenance window.

## Root Cause

The incident was triggered by an accidental restart of the Gateway Server component in NAM1 during a scheduled maintenance window, causing gateways to disconnect in the same pattern observed during the March 3 incident, where simultaneous reconnects under high server load led to premature connection drops due to insufficient timeout and cache configurations.

## Resolution

The development team used the opportunity to roll out a fix ahead of its planned release date. The deployed fix resolved the reconnection bottleneck, and affected gateways are now reconnecting within several minutes. No customer action is required.

## Prevention / Action items

### Process improvements

Procedures around component restarts during maintenance windows will be reviewed to prevent accidental restarts of production-critical components such as the Gateway Server.

### Infrastructure improvements (already applied)

The fix rolled out during this incident addresses the reconnection performance issues identified in the March 3 post-mortem. Gateway Server instances in NAM1 are now able to handle mass reconnect scenarios significantly more efficiently, with reconnection times reduced to within several minutes.

Gateway Connectivity Issues

major

Mar 3, 2026 · resolved Mar 5

## Summary

On March 3, 2026, it was reported that some gateways lost their connection to the LNS for some of the tenants in the NAM1 region. This was triggered by an AWS update procedure that affected the Gateway Server component. Although the affected gateways eventually reconnected, recovery took longer than expected (up to 8 hours for some tenants).

## Impact

* Some gateways got disconnected for some of the tenants in the NAM1 region.
* Three Gateway Server instances restarted during the incident, disconnecting a large number of gateways, which failed to reconnect immediately to the remaining active instances.
* The number of connected gateways kept declining gradually.
* Eventually, the affected gateways reconnected and service recovered without intervention.

## Root Cause

The incident was triggered by an AWS infrastructure event (task retirement), which caused several Gateway Server instances in NAM1 to undergo a rolling restart. As instances restarted one by one, gateways began disconnecting gradually. Since the restart was rolling rather than simultaneous, some gateways maintained their connection to instances that remained active throughout the event.

The root cause of the prolonged recovery, however, was a short connection timeout configured on some gateways. With a large number of gateways attempting to reconnect simultaneously, the Gateway Server was operating under unusually high load, and the short timeout was insufficient under these conditions: connections were closed prematurely, before they could be fully established. This cycle repeated until the restarted instances completed their post-restart operations. At that point server load normalised and Gateway Server caches became available, significantly speeding up the connection process for the remaining disconnected gateways until service was fully restored.

In short: the AWS infrastructure event triggered the gateway disconnects, but the timeout misconfiguration is what made the recovery take up to 8 hours.

## Resolution

There was no manual intervention to resolve this incident. The affected gateways reconnected automatically after the downtime.

## Prevention / Long-term improvements

### Proactive outreach to affected tenant owners regarding gateway connection timeout

A minimum 60-second timeout is necessary for reliable connection establishment under high server load conditions. We will be reaching out to affected tenant owners, recommending that the `TC_TIMEOUT` setting in their Basic Station configuration is set to at least the default value of `60s`. This change will help prevent premature connection drops during periods of elevated reconnect activity.

### Documentation improvements

Existing documentation will be improved to specifically recommend a longer `TC_TIMEOUT` setting in the Basic Station configuration.

### Infrastructure improvements

Our Cloud infrastructure configuration will be improved to reduce post-restart instance load and to accommodate higher load when it occurs.
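
The `TC_TIMEOUT` recommendation above can be checked mechanically before a gateway is deployed. The following is a minimal sketch, not an official tool. It assumes the Basic Station configuration is a plain JSON document (no comments) with the setting under a `station_conf` object, and that durations are written as a number followed by `ms`, `s`, `m`, or `h`. The helper names `parse_duration` and `check_tc_timeout` are hypothetical.

```python
import json
import re

# Minimum recommended by the post-mortem (also stated there as the default).
MIN_TIMEOUT_S = 60.0

# Assumed duration syntax: a number followed by one of these unit suffixes.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text):
    """Convert a duration string such as "60s" or "2m" to seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def check_tc_timeout(conf_text):
    """Return (ok, message) for a station.conf-style JSON document.

    Assumes TC_TIMEOUT lives under the "station_conf" key.
    """
    conf = json.loads(conf_text)
    raw = conf.get("station_conf", {}).get("TC_TIMEOUT")
    if raw is None:
        # Nothing overrides the default, which the post-mortem gives as 60s.
        return True, "TC_TIMEOUT not set; the default applies"
    if parse_duration(raw) < MIN_TIMEOUT_S:
        return False, f"TC_TIMEOUT={raw} is below the recommended 60s"
    return True, f"TC_TIMEOUT={raw} meets the recommended minimum"


if __name__ == "__main__":
    sample = '{"station_conf": {"TC_TIMEOUT": "10s"}}'
    print(check_tc_timeout(sample))
```

A check like this could run in a provisioning script, flagging gateways whose timeout would be too short to survive a mass-reconnect event.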

The Things Gateway Controller down for emergency maintenance

none

Jan 9, 2026 · resolved Jan 9

The incident has been resolved. The Things Indoor Gateway Pro gateways are reconnecting. This may take some additional time due to reconnect backoffs.

Webhook failures in TTSC eu1 cluster

minor

Dec 19, 2025 · resolved Dec 19

A significant increase in downlink queue operations within a short time window caused internal database connections to accumulate, as each request waited for exclusive access to device resources. This connection buildup exhausted the available connection pool, resulting in degraded performance for other operations across the Application Server. Unrelated requests experienced increased latency and timeouts as they competed for the limited available connections. Measures have been taken to prevent this incident from happening again.
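
The failure mode described above, where requests hold a pooled connection while waiting for an exclusive per-device lock, can be illustrated with a small simulation. This is a sketch under stated assumptions and does not reflect the actual Application Server code or the measures that were taken: the pool is modelled as a semaphore, the device resource as a single lock, and `peak_connections` is a hypothetical helper. Acquiring the lock before the connection is shown only as one commonly used mitigation for comparison.

```python
import threading
import time


def peak_connections(n_requests, pool_size, lock_first, hold_s=0.02):
    """Simulate n_requests for one device; return the peak connections held."""
    pool = threading.BoundedSemaphore(pool_size)  # stand-in for the DB pool
    device_lock = threading.Lock()                # exclusive device access
    counter_lock = threading.Lock()
    stats = {"in_use": 0, "peak": 0}

    def take_connection():
        pool.acquire()
        with counter_lock:
            stats["in_use"] += 1
            stats["peak"] = max(stats["peak"], stats["in_use"])

    def return_connection():
        with counter_lock:
            stats["in_use"] -= 1
        pool.release()

    def handle_request():
        if lock_first:
            # Wait for the device first, then take a connection briefly.
            with device_lock:
                take_connection()
                time.sleep(hold_s)  # simulated queue operation
                return_connection()
        else:
            # Take a connection, then wait for the device while holding it.
            take_connection()
            with device_lock:
                time.sleep(hold_s)  # simulated queue operation
            return_connection()

    threads = [threading.Thread(target=handle_request) for _ in range(n_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stats["peak"]


if __name__ == "__main__":
    print("connection first:", peak_connections(8, 4, lock_first=False))
    print("lock first:", peak_connections(8, 4, lock_first=True))
```

In the connection-first variant, waiting requests occupy pool slots while doing no work, which is how unrelated requests can end up starved. In the lock-first variant, at most one connection is held per contended device.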
