Atlassian Analytics Status Page

Databases & Data Platforms · monitored by Alert24

All Systems Operational

Components

Dashboards
Operational
Atlassian Data Lake
Operational
Third party data connections
Operational

Recent Incidents

Users experiencing issues accessing multiple Atlassian products

critical

May 14, 2026 · resolved May 14

### Summary

On May 14, 2026, between 04:30 and 05:26 UTC, Atlassian customers experienced widespread service disruption across multiple Atlassian Cloud products. The issue was caused by a race condition in our internal deployment orchestration platform during a routine rollback operation of a core identity service in the us-east region. This race condition resulted in insufficient capacity for the identity service in the affected region which started returning errors to dependent products. The incident was detected within a minute by automated monitoring systems and mitigated in 56 minutes.

### **IMPACT**

During the incident, customers attempting to access Atlassian Cloud products in the us-east region experienced authentication and permission failures and were unable to access services. Customers also experienced errors when accessing the support portal until Atlassian fell back to an alternate support method. This was caused by a core identity service in the us-east region becoming unavailable.

Affected products included Atlassian Administration, Atlassian Analytics, Bitbucket, Compass, Confluence, Jira, Jira Product Discovery, Jira Service Management and Trello. Some users outside us-east may have been affected in certain scenarios.

### **ROOT CAUSE**

The incident was caused by a race condition in our internal deployment orchestration platform during a routine rollback operation of a core identity service in the us-east region. This race condition resulted in insufficient capacity for the identity service in the affected region which started returning errors to dependent products.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. Atlassian is prioritizing the following actions to help prevent similar incidents in future:

* **Refine deployment orchestration safeguards**
    * Harden our deployment platform to prevent similar race conditions or resulting capacity loss during a rollback operation.
    * Streamline mitigation steps when a service becomes unavailable in a region.
* **Reduce cross-region impact**
    * Improve regional isolation and fallback handling so an issue affecting a single region is less likely to impact customers or product functionality in other regions.

We recognise how critical reliable access to Atlassian products is for our customers' productivity, and we apologize to customers who were impacted by this incident.

Thanks,
Atlassian

Multiple Atlassian services are experiencing issues

minor

May 8, 2026 · resolved May 8

All dates and times below are in UTC unless stated otherwise.

### Summary

On May 8, 2026 between 00:22 and 06:08, one of our hosting providers suffered a significant incident in a specific availability zone in prod-east which led to Atlassian customers experiencing degraded performance and delays of background operations and automation execution. The incident started on May 8, 2026 at 00:22 and was detected within 4 minutes by automated monitoring systems. Our teams worked to restore core access by 06:08. Final cleanup of backlogged processes and minor issues then progressed in stages and was completed by 19:15.

### **IMPACT**

The primary infrastructure affected in this incident was the event processing pipeline in the prod-east region, which distributes events between Atlassian services and underpins background operations such as automation execution, search indexing, notifications, and permission synchronisation.

* Between 00:22 and 06:08, an infrastructure incident in our hosting provider triggered an ingestion failure in our event processing pipeline.
* At 02:50, event ingestion was failed over to an unaffected availability zone, progressively restoring live event flows.
* At 06:08, reliability for new ingestion in prod-east recovered to 100%. The remaining work was to drain the accumulated cross-region backlog of messages, which completed by 17:00.
* By 18:48, Automation had processed its backlog of events created during the impact window.

**Automation**

Between 00:22 and 02:50, customers with automation rules triggered by events originating from the prod-east region experienced a significant reduction in rule executions. During this window, event-triggered automation rules were not firing because the events that trigger them were not being delivered. Rule authoring, saving, and rules triggered manually, by schedules, or by webhooks were not affected.

At 02:50, the event processing infrastructure failed over to an unaffected availability zone, restoring delivery of live events to Automation and allowing new event-triggered rules to begin executing normally. However, events generated during the impact window still needed to be replayed before delayed automations could be processed. Beginning at 08:28, upstream services replayed their queued events in a coordinated sequence, and all replayed events were processed by 18:48.

During the replay window, customers may have experienced automation rules executing later than expected, a small number of rules reaching daily processing limits due to compressed replay, and time-sensitive rules not completing as expected if internal timeout thresholds were exceeded.

**Jira and Jira Service Management**

Between 00:22 and 02:50, customers with tenants hosted in the prod-east region experienced disruption to Jira and Jira Service Management event-driven features like automation, along with a short period of elevated errors during infrastructure failover. Core Jira experiences, including issue view, boards, and project navigation, remained available throughout the incident.

Jira event delivery was affected by the primary impact, preventing downstream services from receiving issue lifecycle events. This affected automation rules triggered by Jira events, AI agent orchestration in Jira, notifications for issue updates and transitions, search indexing for newly created or modified issues, and event-driven integrations between Jira and other Atlassian products.

At 02:50, the event processing infrastructure failed over to an unaffected availability zone, restoring delivery of new events. All events generated during the impact window were retained in a recovery queue and required replaying. This began at 08:28 and completed at 12:00.

During the replay window, customers may have experienced automation rules executing later than expected, delayed notifications arriving hours after the triggering action, temporary gaps in search results for content created or modified during the impact window, and AI agent workflows not completing as expected where internal timeout thresholds were exceeded.

**Confluence**

Between 00:22 and 02:50, customers with tenants hosted in the prod-east region experienced disruptions to event-driven services in Confluence. This resulted in delays to search indexing, notifications, automation rule execution, and permission synchronisation.

The underlying event processing infrastructure failed over to an unaffected availability zone, after which live Confluence operations resumed normally. However, events generated during the impact window were queued for replay, and some background services remained delayed until that replay and related validation work completed. Between 10:14 and 17:00, a bulk replay of all queued tenant replay tasks was completed to restore data consistency.

During and immediately after the replay window, customers may have experienced search results not reflecting content created or modified during the outage, delayed or missing notifications for page and comment activity, automation rules firing later than expected, and brief delays in permission synchronisation for tenants relying on incremental identity sync.

**Bitbucket and Pipelines**

Between 00:22 and 06:08, customers using Bitbucket and Pipelines experienced failures and degraded functionality across event-driven workflows. Core Git operations, including push, pull, and clone, were not affected and continued to operate normally throughout the incident.

Automatic pipeline triggers initiated by push or pull request events were unavailable during the impact window. Merge queues, custom merge checks, Forge-based triggers, workspace permission changes, and some workspace provisioning flows were also affected. Customers using merge queues were unable to merge pull requests, and some pipeline steps failed because queued work contributed to elevated concurrency limits.

At approximately 03:57, Pipelines was reconfigured to consume events through an alternative path, restoring automatic pipeline triggering. Merge queues, custom merge checks, Forge triggers, and other affected workflows were progressively restored as the underlying event processing infrastructure recovered. All Bitbucket and Pipelines services were confirmed fully operational by 06:08.

After recovery, queued events were reviewed and replayed where safe to restore data consistency for billing, audit logging, and other background processes.

**Identity Services**

Between 00:22 and 02:50, customers with tenants hosted in the prod-east region experienced delays in the propagation of identity and group membership changes to downstream Atlassian products. Core identity operations, including authentication, login, and direct group management actions, were not affected and continued to function normally throughout the incident.

The impact was limited to asynchronous, event-driven operations that depend on the event processing pipeline. This included delays in delivering group membership and user profile changes to products such as Jira and Confluence, which affected downstream permission synchronisation and crowd sync flows. A small number of SCIM-based identity synchronisation and site provisioning workflows also experienced temporary delays.

After the event processing infrastructure recovered, backed-up identity and group directory events were replayed where required, restoring downstream consistency for affected products. No identity data was lost. Group membership changes, user profile updates, and provisioning-related events that occurred during the impact window were retained and processed after recovery.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know outages impact your productivity. While our monitoring and recovery processes helped us respond quickly, this incident highlighted opportunities to further strengthen resilience for event-driven services. We are prioritizing improvements that will:

* **Enhance failover coverage** so critical event processing can recover more smoothly during infrastructure disruptions.
* **Strengthen recovery handling** so replayed events can be processed more quickly.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support.
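The recovery pattern described in this report (live event delivery resumes after failover, while events from the impact window are retained and replayed later) can be illustrated with a small sketch. This is not Atlassian's implementation: the `Event` and `Pipeline` names, the in-memory queue, and the `batch_size` parameter are all assumptions made for illustration only.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque


@dataclass(frozen=True)
class Event:
    event_id: int
    payload: str


@dataclass
class Pipeline:
    """Hypothetical event pipeline with a recovery queue (illustrative only)."""

    deliver: Callable[[Event], None]
    healthy: bool = True
    recovery_queue: Deque[Event] = field(default_factory=deque)

    def publish(self, event: Event) -> None:
        # While ingestion is failing, retain the event instead of dropping it.
        if self.healthy:
            self.deliver(event)
        else:
            self.recovery_queue.append(event)

    def replay(self, batch_size: int) -> int:
        """Replay up to batch_size retained events in arrival order."""
        replayed = 0
        while self.recovery_queue and replayed < batch_size:
            self.deliver(self.recovery_queue.popleft())
            replayed += 1
        return replayed
```

Replaying in bounded batches is one way to limit the "compressed replay" effect mentioned above, where many delayed events arrive in a short period and push consumers toward their processing limits.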

Multiple products impacted by search failures

minor

Apr 8, 2026 · resolved Apr 8

### Summary

On April 8, 2026, between 04:46 UTC and 12:09 UTC, search functionality was unavailable or degraded across several Atlassian Cloud products, including Jira, Confluence, Jira Service Management, Rovo, Rovo Dev, Loom, Guard Standard, Customer Service Management and Atlassian Administration. A configuration change increased the resources reserved for a core system component that runs on nodes in our compute platform. On a subset of clusters configured for high‑density workloads, the increased reservations exceeded available node capacity, interrupting search and related experiences for affected customers.

The root cause was identified and a rollback was merged at 05:42 UTC, with some systems seeing recovery by 07:33 UTC. Core search functionality was restored by approximately 08:55 UTC, and full downstream recovery completed by 12:09 UTC.

### **IMPACT**

During the impact period, some customers experienced outages or degradation in search across Jira, Confluence, Jira Service Management, Rovo, Rovo Dev, Loom, Guard Standard, Customer Service Management and Atlassian Administration. Other experiences that rely on search, such as quick find, navigation, AI assistants, and dashboards, were also intermittently affected during this period.

Impacted customers may have been unable to find pages or recordings and experienced degraded performance in finding issues; received empty or delayed search results; or experienced AI assistants and dashboards that could not retrieve relevant context.

**Jira, Jira Service Management and Customer Service Management:** Search and experiences that depend on search, like finding issues and agent responses in CSM, remained available but with degraded performance in fallback mode. By 12:09 UTC, search indexes and search performance were fully restored from fallback to full capacity across all regions.

**Guard Standard and Atlassian Administration:** Search functionality was unavailable for parts of the incident window. As a result, Domain Claims, usage tracking, and managed accounts were degraded for portions of the window. These services were restored to operational status by 07:33 UTC. Guard Premium was not impacted by this issue.

**Confluence:** Search functionality was unavailable for parts of the incident window. Recovery began at 07:30 UTC as backend search clusters were restored. Full recovery, including search index replay, completed at 11:37 UTC.

**Loom:** Search functionality and some experiences that rely on Confluence Search (such as sharing to spaces) were unavailable for portions of the window and fully restored at 11:37 UTC.

**Rovo and Rovo Dev:** Rovo agents remained responsive but experienced degraded functionality due to loss of search capabilities in underlying services. They were unable to reliably return context about work items or pages. Functionality was fully restored at 11:37 UTC.

### **ROOT CAUSE**

Atlassian products rely on OpenSearch clusters to power their search capabilities including issue search, content search, and AI-powered search features. An infrastructure configuration change increased resource reservations (CPU & Memory) for a system component that runs across our compute platform. On a subset of clusters configured for high-density workloads, the increased reservations exceeded available node capacity. This caused search workloads to be evicted and, in some clusters, they could not be rescheduled onto any available nodes, impacting search functionality across affected products.

The change was deployed across multiple production clusters in a short time frame, limiting the opportunity to detect the capacity conflict in a smaller subset of clusters before it reached the wider fleet. Automated scaling systems attempted to recover by provisioning additional capacity, but in the worst‑affected clusters this led to runaway node scaling and exhaustion of available network resources, prolonging recovery time.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We understand that service disruptions impact your productivity. In addition to our existing testing and preventative processes, Atlassian is prioritizing the following actions to help reduce the likelihood and impact of similar incidents in the future and to speed up recovery when issues occur:

* **Enforce smaller deployment cohorts and larger soak for critical platform changes for these cluster types** Implement smaller deployment cohorts, mandatory soak periods between environments, and automated health gates so that changes are validated on a limited set of clusters before being promoted more broadly.
* **Strengthen automated pre‑deploy validation for resource changes** Add validation checks to ensure resource changes for system components are compatible with node capacity and reserved headroom, preventing system workloads from crowding out customer workloads.
* **Improve post‑deploy verification and alerting** Enhance monitoring and post‑deployment verification to detect patterns such as spikes in pending pods, runaway node scaling, and low pod‑IP headroom closely correlated with new configuration being rolled out.
* **Align autoscaling behavior with capacity and safety limits** Align autoscaling capacity calculations with node reservations and introduce safeguards and circuit breakers to prevent runaway scaling and to enforce safe limits on node and pod IP counts.
* **Enhance recovery automation** Improve automation and runbooks so we can safely disable autoscaling, remove empty nodes in bulk, and restore normal operations faster across multiple clusters in parallel.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability and to reduce the risk and impact of similar issues in future.

Thanks,
Atlassian Customer Support
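The pre‑deploy validation described in the remedial actions above (checking that a system component's resource reservation still leaves room for existing workloads on each node type) could look roughly like the following. This is a minimal sketch under stated assumptions, not Atlassian's tooling: the `Resources` type, the function names, the 10% headroom default, and all numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_mib: int


def fits_on_node(
    allocatable: Resources,
    system_reserved: Resources,
    workload_requests: Resources,
    headroom_fraction: float = 0.10,  # assumed safety margin, not a real default
) -> bool:
    """Return True if system + workload requests fit within node capacity minus headroom."""
    usable_cpu = allocatable.cpu_millicores * (1 - headroom_fraction)
    usable_mem = allocatable.memory_mib * (1 - headroom_fraction)
    needed_cpu = system_reserved.cpu_millicores + workload_requests.cpu_millicores
    needed_mem = system_reserved.memory_mib + workload_requests.memory_mib
    return needed_cpu <= usable_cpu and needed_mem <= usable_mem


def unsafe_clusters(
    clusters: Dict[str, Tuple[Resources, Resources]],
    new_system_reserved: Resources,
) -> List[str]:
    """List clusters (name -> (node allocatable, workload requests)) the change would overcommit."""
    return sorted(
        name
        for name, (allocatable, workload) in clusters.items()
        if not fits_on_node(allocatable, new_system_reserved, workload)
    )
```

For example, with a hypothetical 8000m / 32768 MiB node, a high‑density cluster already requesting 6800m / 28000 MiB would pass with a 200m / 512 MiB system reservation but fail once the reservation is raised to 800m / 2048 MiB, while a less densely packed cluster would pass in both cases.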

Degraded performance of Admin Hub, Atlassian Analytics, Confluence Cloud, Focus, Jira, Jira Product Discovery, Jira Service Management, and Rovo

minor

Dec 15, 2025 · resolved Dec 15

On December 15th, 2025, Admin Hub, Atlassian Analytics, Confluence Cloud, Ecosystem, Focus, Jira, Jira Product Discovery, Jira Service Management, and Rovo users in the prod-us-east region may have experienced performance degradations and errors on the web page and mobile apps. The issue has now been resolved, and the service is operating normally for all affected customers.

Atlassian Cloud Services impacted

major

Oct 20, 2025 · resolved Oct 21

### Postmortem publish date: Nov 19th, 2025

### Summary

All dates and times below are in UTC unless stated otherwise.

Customers utilizing Atlassian products experienced elevated error rates and degraded performance between Oct 20, 2025 06:48 and Oct 21, 2025 04:05. The service disruptions were triggered by an [AWS DynamoDB outage](https://aws.amazon.com/message/101925/#:~:text=1%3A50%20PM.-,DynamoDB,-Between%2011%3A48) and further affected by subsequent failures in [AWS EC2](https://aws.amazon.com/message/101925/#:~:text=service%20disruption%20event.-,Amazon%20EC2,-Between%2011%3A48) and [AWS Network Load Balancer](https://aws.amazon.com/message/101925/#:~:text=service%20disruption%20event.-,Amazon%20EC2,-Between%2011%3A48) within the us-east-1 region.

The incident started at Oct 20, 2025 06:48 and was detected within six minutes by our automated monitoring systems. Our teams worked to restore all core services by Oct 21, 2025 04:05. Final cleanup of backlogged processes and minor issues was completed on Oct 22, 2025.

We recognize the critical role our products play in your daily operations, and we offer our sincere apologies for any impact this incident had on your teams. We are taking immediate steps to enhance the reliability and performance of our services, so that you continue to receive the standard of service you have come to trust.

### IMPACT

Before examining product-level impacts, it's helpful to understand Atlassian's service topology and internal dependencies.

Products such as Jira and Confluence are deployed across multiple AWS regions. The data for each tenant is stored and processed exclusively within its designated host region. This design is intentional and represents the desired operational state, as it limits the impact of any regional outage strictly to tenants in-region, in this case us-east-1.

While in-scope application data is pinned to the region selected by the customer, there are times when systems need to call other internal services that may be based in a different region. If a problem occurs in the main region where these services operate, systems are designed to automatically fail over to a backup region, usually within three minutes. However, if unexpected issues arise during this failover, it can take longer to restore services. In rare cases, this could affect customers in more than one region.

It’s important to note that all in-scope application data for supported products is pinned according to a customer’s chosen region.

**Jira**

Between Oct 20, 2025 06:48 and Oct 20, 2025 20:00, customers with tenants hosted in the us-east-1 region experienced increased error rates when accessing core entities such as Issues, Boards, and Backlogs. This disruption was caused by AWS's inability to allocate AWS EC2 instances and elevated errors in AWS Network Load Balancer (NLB). During this window, users may also have observed intermittent timeouts, slow page loads, and failures when performing operations like creating or updating issues, loading board views, and executing workflow transitions.

Between Oct 20, 2025 08:36 and Oct 20, 2025 09:23, customers across all regions experienced elevated failure rates when attempting to load Jira pages. This disruption was caused by the regional frontend service entering an unhealthy state during this specific time interval. Normally, the frontend service connects to the primary AWS DynamoDB instance located in the us-east-1 region to retrieve the most recent configuration data necessary for proper operation. Additionally, the service is designed with a fallback mechanism that references static configuration data in the event that the primary database becomes inaccessible. Unfortunately, a latent bug existed in the local fallback path. When the frontend service nodes restarted, they were unable to load critical operational configuration data from primary or fallback sources, leading to the observed failures experienced by customers.

Between Oct 20, 2025 06:48 and Oct 21, 2025 06:30, customers experienced significant delays and missing Jira in-app notifications across all regions. The notification ingestion service, which is hosted exclusively in us-east-1, exhibited an increased failure rate when processing notification messages due to AWS EC2 and NLB issues. This issue resulted in notifications being delayed - and in some cases, not delivered at all - to users worldwide.

**Jira Service Management (JSM)**

JSM was impacted similarly to Jira above, with the same timeframes and for the same reasons.

Between Oct 20, 2025 08:36 and Oct 20, 2025 09:23, customers across all regions experienced significantly elevated failure rates when attempting to load JSM pages. This affected all JSM experiences including the Help Centre, Portal, Queues, Work Items, Operations, and Alerts.

**Confluence**

Between Oct 20, 2025 06:48 and Oct 21, 2025 02:45, customers using Confluence in the us-east-1 region experienced elevated failure rates when performing common operations such as editing pages or adding comments. The primary cause of this service degradation was the system's inability to auto-scale to manage peak traffic load effectively, due to AWS EC2 issues.

Though the AWS outage ended at Oct 20, 2025 21:09, a subset of customers continued to experience failures as some Confluence web server nodes across multiple clusters remained in an unhealthy state. This was ultimately mitigated by recycling the affected nodes.

To protect our systems while AWS recovered, we made a deliberate decision to enable node termination protection. This action successfully preserved our server capacity but, as a trade-off, it extended the time required for a full recovery once AWS services were restored.

**Automation**

Between Oct 20, 2025 06:55 and Oct 20, 2025 23:59, automation customers whose rules are processed in us-east-1 experienced delays of up to 23 hours in rule execution. During this window, some events triggering rule executions were processed out of order because they arrived later during backlog processing. This caused potential inconsistencies in workflow executions, as rules were run in the order events were received, not when the action causing the event occurred. Additionally, some rule actions failed because they depend on first-party and third-party systems, which were also affected by the AWS outage. Customers can see most of these failures in their audit logs; however, a few updates were not logged due to the nature of the outage.

By Oct 21, 2025 05:30, the backlog of rule runs in us-east-1 was cleared. Although most of these delayed rules were successfully handled, there were some additional replays of events to ensure completeness. Our investigation confirmed that a few events may never have triggered their associated rules due to the outage.

Between Oct 20, 2025 06:55 and Oct 20, 2025 11:20, all non-us-east-1 regional automation services experienced delays of up to 4 hours in rule execution. This was caused by an upstream service that was unable to deliver events as expected. The delivery service encountered a failure due to a cross-region dependency call to a service hosted in the us-east-1 region. Because of this dependency issue, the delivery service was unable to successfully deliver events throughout this time frame, resulting in customer-defined rules not being executed in a timely manner.

**Bitbucket and Pipelines**

Between Oct 20, 2025 06:48 and Oct 20, 2025 09:33, Bitbucket experienced intermittent unavailability across core services. During this period, users faced increased error rates and latency when signing in, navigating repositories, and performing essential actions such as creating, updating, or approving pull requests. The primary cause was an AWS DynamoDB outage that impacted downstream services.

Between Oct 20, 2025 06:48 and Oct 20, 2025 22:46, numerous Bitbucket Pipeline steps failed to start, stalled mid-execution, or experienced significant queueing delays. Impact varied, with partial recoveries followed by degradation as downstream components re-synchronized. The primary cause was an AWS DynamoDB outage, compounded by instability in AWS EC2 instance availability and AWS Network Load Balancers.

Furthermore, Bitbucket Pipelines continued to experience a low but persistent rate of step timeouts and scheduling errors due to AWS bare-metal capacity shortages in select availability zones. Atlassian coordinated with AWS to provision additional bare-metal hosts and addressed a significant backlog of pending pods, successfully restoring services by 01:30 on Oct 21, 2025.

**Trello**

Between Oct 20, 2025 06:48 and Oct 20, 2025 15:25, users of Trello experienced widespread service degradation and intermittent failures due to upstream AWS issues affecting multiple components, including AWS DynamoDB and subsequent AWS EC2 capacity constraints. During this period, customers reported elevated error rates when loading boards, opening cards, and adding comments or attachments.

**Login**

Between Oct 20, 2025 06:48 and Oct 20, 2025 09:30, a small subset of users experienced failures when attempting to initiate new login sessions using SAML tokens. This resulted in an inability for those users to access Atlassian products during that time period. However, users who already had valid active sessions were not affected by this issue and continued to have uninterrupted access.

The issue impacted all regions globally because regional identity services relied on a write replica located in the us-east-1 region to synchronize profile data. When the primary region became unavailable, the failover to a secondary database in another region failed, which delayed recovery. This failover defect has since been addressed.

**Statuspage**

Between Oct 20, 2025 06:48 and Oct 20, 2025 09:30, Statuspage customers who were not already logged in to the management portal were unable to log in to create or update incident statuses. This impact was restricted only to users who were not already logged in at the time. The root cause was the same as described in the Login section above, and it was resolved by the same remediation steps.

### REMEDIAL ACTION PLAN & NEXT STEPS

We have completed the following critical actions designed to help prevent cross-region impact from similar issues:

* Resolved the code defect in the fallback option to ensure that Jira Frontend Services in other regions remain unaffected during a region-wide outage.
* Fixed the issue that prevented timely failover of the identity service which impacted new login sessions.
* Resolved the code defect so that delivery services in unaffected regions remain operational during region-wide outages.

Additionally, we are prioritizing the following improvement actions:

* Implement mitigation strategies to strengthen resilience against region-wide outages in the notification ingestion service.

Although disruptions to our cloud services are sometimes unavoidable during outages of the underlying cloud provider, we continuously evaluate and improve test coverage to strengthen resilience of our cloud services against these issues.

We recognize the critical importance of our products to your daily operations and overall productivity, and we extend our sincere apologies for any disruptions this incident may have caused your teams. If you were impacted and require additional details for internal post-incident reviews, please reach out to your Atlassian support representative with affected timeframes and tenant identifiers so we can correlate logs and provide guidance.

Thanks,
Atlassian Customer Support
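The Jira section of this report describes a frontend service that reads configuration from a primary store and falls back to static data, where a latent bug in the fallback path only surfaced during the outage. One general way to guard against that class of defect is to exercise the fallback path routinely, with the primary deliberately simulated as down. The sketch below illustrates the idea only; it is not Atlassian's code, and `REQUIRED_KEYS`, `load_config`, and `fallback_path_works` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical keys a service might need before it can start serving traffic.
REQUIRED_KEYS = ("routing", "feature_flags")


class ConfigUnavailable(Exception):
    """Raised when neither the primary nor the fallback yields a usable config."""


def _validate(config: Any, source: str) -> Dict[str, Any]:
    if not isinstance(config, dict):
        raise ConfigUnavailable(f"{source} config is not a mapping")
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ConfigUnavailable(f"{source} config is missing keys: {missing}")
    return config


def load_config(
    fetch_primary: Callable[[], Any], fallback_json: str
) -> Tuple[Dict[str, Any], str]:
    """Try the primary store first, then the static fallback; report which was used."""
    try:
        return _validate(fetch_primary(), "primary"), "primary"
    except (ConnectionError, TimeoutError, ConfigUnavailable):
        pass  # primary unreachable or unusable: fall through to static data
    try:
        fallback = json.loads(fallback_json)
    except json.JSONDecodeError as exc:
        raise ConfigUnavailable("fallback config is not valid JSON") from exc
    return _validate(fallback, "fallback"), "fallback"


def fallback_path_works(fallback_json: str) -> bool:
    """Simulate a primary outage and check the fallback alone yields a usable config."""

    def primary_down() -> Any:
        raise ConnectionError("simulated primary outage")

    try:
        _, source = load_config(primary_down, fallback_json)
    except ConfigUnavailable:
        return False
    return source == "fallback"
```

Running a check like `fallback_path_works` in CI or as a periodic health probe means a broken or incomplete static configuration is caught before a real regional outage forces the service onto that path.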
