Experiencing degraded performance for SSO
Incident Report for WorkOS
Postmortem

From 2023-08-28 23:22 UTC to 2023-08-29 02:22 UTC, WorkOS’s SSO product was unavailable to users through custom domains. Requests to those domains returned HTTP 403 Forbidden errors.

We understand that WorkOS sits on a critical path for our customers’ applications. This is not a responsibility we take lightly, and this outage is not in line with the service we aim to provide. We are taking all necessary steps to ensure an incident like this does not happen again.

Who was affected?

The incident affected SSO API requests for users with custom hostnames. Affected requests during this time resulted in 403 errors and displayed the error message “This web property is not accessible via this address.”

What happened?

While we were performing maintenance on our Web Application Firewall, a new set of rules was applied to production. This change marked some legitimate requests as anomalous. Alerts were not properly configured to notify engineers of a spike in anomalous 4XX traffic.

The main factor that led to this incident was inadequate controls over how production Web Application Firewall changes are applied.

What will we do to mitigate problems like this in the future?

Moving forward, WorkOS will take the following actions:

  1. Establish additional access control policies around applying Web Application Firewall changes.
  2. Add monitoring around increases in anomalous traffic.
  3. Add monitoring around failures with custom hostnames.
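
As an illustration of items 2 and 3, the following is a minimal sketch of a per-hostname check for spikes in 4XX responses. It is not WorkOS's implementation: the Request record, the function names, the example hostnames, and the thresholds (min_requests, ratio, floor) are all assumptions chosen for the example. It compares the 4XX rate in a recent window against a baseline window and flags hostnames whose rate has jumped.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical log record; the field names are assumptions for this sketch."""
    hostname: str  # hostname the request arrived on (e.g. a custom domain)
    status: int    # HTTP status code returned to the client


def client_error_rates(requests):
    """Return {hostname: fraction of its requests answered with a 4XX status}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for req in requests:
        totals[req.hostname] += 1
        if 400 <= req.status < 500:
            errors[req.hostname] += 1
    return {host: errors[host] / totals[host] for host in totals}


def hosts_to_alert(window, baseline, min_requests=20, ratio=3.0, floor=0.05):
    """Flag hostnames whose 4XX rate in `window` spiked relative to `baseline`.

    A hostname is flagged when it has at least `min_requests` requests in the
    window and its 4XX rate exceeds both `floor` and `ratio` times its
    baseline rate. All three thresholds are illustrative values.
    """
    current = client_error_rates(window)
    previous = client_error_rates(baseline)
    counts = defaultdict(int)
    for req in window:
        counts[req.hostname] += 1

    flagged = []
    for host, rate in current.items():
        if counts[host] < min_requests:
            continue  # too little traffic to judge
        threshold = max(floor, ratio * previous.get(host, 0.0))
        if rate > threshold:
            flagged.append(host)
    return sorted(flagged)


# Example: a custom hostname that suddenly returns 403 for every request.
baseline = (
    [Request("sso.customer.example", 200)] * 98
    + [Request("sso.customer.example", 401)] * 2
    + [Request("api.vendor.example", 200)] * 98
    + [Request("api.vendor.example", 404)] * 2
)
window = (
    [Request("sso.customer.example", 403)] * 50
    + [Request("api.vendor.example", 200)] * 49
    + [Request("api.vendor.example", 404)] * 1
)
alerts = hosts_to_alert(window, baseline)
```

Grouping by hostname matters in a case like this one, where only requests through custom hostnames failed: an aggregate 4XX rate across all traffic could stay below an alerting threshold while a subset of hostnames is fully unavailable.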
Posted Sep 08, 2023 - 18:58 EDT

Resolved
This issue has been resolved.
Posted Aug 28, 2023 - 22:59 EDT
Monitoring
Services have returned to normal and we are continuing to monitor the situation.
Posted Aug 28, 2023 - 22:43 EDT
Identified
We've identified the issue and are working on a resolution.
Posted Aug 28, 2023 - 22:33 EDT
Update
We are experiencing a partial outage of SSO. We are currently investigating and will update when we identify the issue.
Posted Aug 28, 2023 - 22:12 EDT
Investigating
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
Posted Aug 28, 2023 - 22:11 EDT
This incident affected: Core Services (SSO).