Current Status
All Systems Operational
Recent Incidents
Custom Transformation Failure During Materialization
major · May 29, 2026 · resolved May 29
This morning, we experienced an issue with the execution of custom transformations during the final materialization phase of our data pipeline. Custom transformations are used by approximately 22% of our integrations. Integrations without custom transformations were not impacted. Integrations that used "Edlink-defined" transformations (such as "Infer Roles") were not impacted.

### **Timeline of Events**

- Our transformation execution relies on private NPM packages in our NPM organization.
- Around 3:12am CT, the first custom transformation failed because it was unable to load packages from NPM.
- The first client issues were reported around 8:30am CT, but it was not apparent that the problem was widespread until approximately 9:50am CT, when it was escalated to P1.
- We resolved the issue at approximately 10:10am CT, but due to latency, materializations did not resume until approximately 10:23am CT.

### **Root Cause**

The root cause was an expired credit card. After a failed payment, NPM unexpectedly downgraded our account and we were no longer able to access our own packages. We re-upgraded the account and, after some brief latency in their billing system, we were able to access our packages again.

### **Changes Going Forward**

1. In response, we plan to introduce additional alerting to detect unusual changes in our materializations.
2. We are also seeking alternatives to NPM for hosting the private packages used for materialization execution. Our use of NPM was a technical requirement when we first implemented custom transformations, but that may no longer be the case.
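The alerting mentioned under the changes going forward could take many forms. As a minimal sketch, assuming each materialization run is recorded as a timestamp plus a success flag, the following flags a time window in which the failure rate climbs above a threshold. The class name, window size, and thresholds are hypothetical illustrations, not a description of Edlink's actual monitoring.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FailureRateAlert:
    """Sliding-window failure-rate check (hypothetical; values are illustrative)."""

    window_seconds: float = 900.0  # look at the last 15 minutes
    max_failure_rate: float = 0.2  # alert above 20% failures
    min_events: int = 5            # avoid alerting on tiny samples
    _events: deque = field(default_factory=deque)

    def record(self, timestamp: float, succeeded: bool) -> bool:
        """Record one materialization result; return True if an alert should fire."""
        self._events.append((timestamp, succeeded))
        # Drop events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()
        if len(self._events) < self.min_events:
            return False
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events) > self.max_failure_rate
```

A check like this, fed by every run, would fire on a sustained rise in failures without waiting for client reports. In practice the thresholds would need tuning against normal failure rates.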
Instructure Data Breach & Product Outage
critical · May 7, 2026 · resolved May 8
You may know by now that Instructure was breached by bad actors twice in the last week. At this time, we have no reason to believe that our clients (or Edlink itself) were compromised. However, out of an abundance of caution and as a best practice, we recommend that all districts using Canvas rotate their API and LTI keys. Here's why:

While it is Instructure’s current position that there’s “no evidence” that any data was accessed, we don’t share their level of conviction. Our doubt is based on the fact that attackers hijacked substantially all production Canvas instances to show a ransom message, which would require a fairly high level of access to Instructure’s infrastructure and systems. If a bad actor did get access to Instructure's core systems, they would likely have access to API and LTI keys.

For API & LTI 1.3 integrations, exposed keys mean that attackers can exfiltrate any data that those keys have access to. Attackers can act as your application and use end user tokens to retrieve data. It is unlikely that attackers will be able to impersonate Canvas users to sign into your platform, assuming you have correctly implemented OAuth 2.0 or OIDC (for LTI).

For LTI 1.1 integrations, exposed keys mean that attackers can potentially sign into your product as “legitimate” end users. This can lead to data exfiltration from your product, and it will be difficult or impossible to tell whether traffic is legitimate. As such, we recommend that you immediately rotate all LTI 1.1 keys or upgrade to LTI 1.3, if possible.

In either case, the attackers could “sit” on stolen credentials for months or years before they decide to use them. By the time they do make their move, this incident may be a distant memory and it will be unclear to those affected exactly how the unauthorized access was obtained.
Later today or early tomorrow, we will release a user interface that you can share with your Canvas school customers to make the key rotation process as seamless as possible for school IT admins. If you'd like to rotate keys sooner, we would be happy to work directly with you or your schools to knock this out.
Degraded API Performance
none · May 4, 2026 · resolved May 4
**Date of Incident:** May 4, 2026

**Duration:** 12:07 PM – 1:04 PM (57 minutes)

**Impact:** Near-total interruption of user login capabilities.

**Status:** Resolved

### **Executive Summary**

Between 12:07 PM and 1:04 PM today, Edlink experienced a severe API degradation that prevented substantially all users from logging into client products via SSO. The incident was caused by database connection starvation, triggered by a service deployment that used an older pinned version of the PostgreSQL (PG) library. The issue was mitigated by bringing down the problematic service to free up connections, followed by a full revert of the change. Normal operations have been fully restored.

### **Impact**

* **User Experience:** Substantially all users attempting to access Edlink during the 57-minute window were unable to log in or authenticate. A small portion were able to authenticate normally.
* **System Impact:** Critical services, including the authentication/login service, were starved of database connections and timed out or dropped requests.

### **Root Cause**

The incident was traced back to a deployment at 12:07 PM CT today in which part of our service was upgraded.

**Background on Dependency Version Pinning:** To provide context on _why_ an older version was in use, we recently implemented a strict policy for managing our software dependencies. This proactive security measure was enacted following an industry-wide supply chain attack in March of 2026. Edlink was not directly affected by that attack, but we determined that the attack vector was not one we were comfortable with, so we moved to proactively update our systems.

To safeguard Edlink's platform, we instituted a policy to "pin" or lock all our software dependencies to explicitly verified, “safe” versions. This prevents our systems from automatically downloading new, potentially unverified updates, which was the attack vector used in the industry-wide attack a few weeks back.

While this policy is crucial for preventing malicious, unverified updates from automatically reaching our systems, in this instance it had an unintended side effect: it forced our deployment to use an older version of the PG library than we had intended. This older version managed Postgres connections inefficiently and leaked unused connections. Upon deployment, the service rapidly consumed the available database connection pool. Because the connections were not being properly released, other critical infrastructure (notably the service handling user logins) was left without available database connections, leading to the system-wide login failures.

### **Timeline**

* **12:07 PM:** The deployment containing the outdated PG library goes live. Database connection usage immediately begins to spike. Users start experiencing login failures.
* **~12:45 PM:** Emergency mitigation is enacted: the problematic service is brought down, immediately releasing its held connections back to the pool and restoring baseline functionality for the login service.
* **1:04 PM:** The deployment is officially reverted to the previous stable state. System stability is confirmed, and the incident is closed.

### **Resolution and Recovery**

The immediate bleeding was stopped by intentionally spinning down the problematic service, which allowed the authentication services to recover. The underlying code change was then completely reverted in version control and redeployed so that the service could be brought back online safely without causing a secondary outage.

### **Preventative Measures**

To ensure we can more quickly identify and resolve similar issues in the future, we have integrated additional monitoring tools focused on detecting failed logins. This will allow us to detect authentication issues immediately and mitigate them before they cause a prolonged system-wide impact. Additionally, we have conducted a review of any other dependency packages that may have been inadvertently downgraded during this “pinning” process.
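The connection starvation described in the root cause can be illustrated with a toy model. This is a sketch only: `ToyPool` is a hypothetical stand-in for a shared database connection limit, not the PG library or Edlink's services. It shows how a caller that never releases connections exhausts the shared limit, and how releasing in a `finally` path (here via a context manager) avoids that.

```python
from contextlib import contextmanager


class PoolExhausted(Exception):
    """Raised when no connections are left (a real client would wait or time out)."""


class ToyPool:
    """Hypothetical fixed-size pool that only counts connections."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.in_use = 0

    def acquire(self) -> None:
        if self.in_use >= self.size:
            raise PoolExhausted("no free connections")
        self.in_use += 1

    def release(self) -> None:
        self.in_use -= 1

    @contextmanager
    def connection(self):
        """Acquire a connection and always release it, even on error."""
        self.acquire()
        try:
            yield
        finally:
            self.release()


def leaky_request(pool: ToyPool) -> None:
    pool.acquire()  # bug: the connection is never released


def safe_request(pool: ToyPool) -> None:
    with pool.connection():
        pass  # the query would run here


def login(pool: ToyPool) -> bool:
    """Stand-in for a login service sharing the same connection limit."""
    try:
        with pool.connection():
            return True
    except PoolExhausted:
        return False
```

With a pool of three connections, any number of `safe_request` calls leaves `login` working, while three `leaky_request` calls are enough to make every later `login` fail until the leaking service is stopped.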
Meta API request timeouts
critical · Mar 2, 2026 · resolved Mar 2
For 172 seconds, our meta API (including SSO endpoints) was unresponsive. This incident took place around 2:30pm CT. The incident was caused by a lock on a table related to single sign-on. The lock was the result of a database maintenance query that we were running at the time. The query was expected to complete in less than one second (and therefore not cause any service disruption), but it held the lock longer than expected. We terminated the execution of the query approximately two minutes later.
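One common guard against this failure mode is to bound how long a maintenance query may run, so that a slow query is cancelled automatically instead of holding its lock until someone intervenes. The sketch below assumes the database is PostgreSQL (this incident report does not say) and uses PostgreSQL's `lock_timeout` and `statement_timeout` settings. The helper name and the example query are hypothetical.

```python
def guarded_maintenance_sql(query: str, max_ms: int = 1000) -> list[str]:
    """Wrap a maintenance query in a transaction with PostgreSQL timeouts.

    SET LOCAL limits each setting to the current transaction. If the query
    waits longer than max_ms for a lock, or runs longer than max_ms,
    PostgreSQL cancels it; its locks are released when the transaction ends.
    """
    if max_ms <= 0:
        raise ValueError("max_ms must be positive")
    return [
        "BEGIN",
        f"SET LOCAL lock_timeout = '{max_ms}ms'",
        f"SET LOCAL statement_timeout = '{max_ms}ms'",
        query.rstrip().rstrip(";"),
        "COMMIT",
    ]


# Hypothetical example: the table name is illustrative only.
statements = guarded_maintenance_sql("DELETE FROM example_table WHERE expired;", 500)
```

A caller would execute these statements in order on a single connection. If the query is cancelled, the transaction is aborted and the final `COMMIT` rolls it back.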
12/27 - Scheduled Downtime
maintenance · Dec 28, 2025 · resolved Dec 28
The maintenance is complete and we're back to normal API operations.