On Friday, August 1st, 2025 around 14:10 UTC, we experienced an authentication outage affecting login functionality for all customers in AWS US environments. The incident was caused by a DNS configuration change made during architectural updates. After identifying the issue, authentication services were fully restored across all affected environments by 15:35 UTC.
During the incident, customers in the affected US AWS environments experienced the following issues:
Authentication Services: Users were unable to complete standard login flows in the dbt platform.
Account Management: Users could not access or modify account credentials.
At 14:10 UTC, dbt Labs monitoring detected elevated authentication failures in the affected U.S. environments. At 14:25 UTC, our Support team began receiving customer reports of login issues. The incident was promptly escalated to the highest severity, and our platform team began investigating immediately.
Our investigation quickly determined that the authentication disruption stemmed from an internal DNS configuration change. Once the root cause was identified, engineers restored the DNS record to the correct value. Full service restoration was achieved by 15:35 UTC, with authentication functionality verified across all affected environments.
The outage was caused by an overwrite of DNS records related to our authentication services during architectural updates. During the process, a configuration setting was inadvertently copied between environments, which affected the authentication service settings. This change disrupted the authentication used across all US AWS environments.
❓Why did this issue only affect US AWS environments?
The DNS configuration that was incorrectly applied impacted authorization endpoints used by our US AWS environments. Other regions use different authentication configurations that were not affected by this change.
❓Why were all US AWS regions impacted simultaneously?
When the DNS override was applied, it affected the authentication endpoints used by our US AWS environments.
To prevent similar issues in the future, we are implementing the following changes:
Monitoring Improvements: We experienced a longer-than-optimal resolution time for this incident. We are enhancing our alerting and testing systems for our authentication services to detect login disruptions sooner which will reduce response time.
Environment Isolation: We are enhancing our processes to isolate authentication configurations between environments.
Enhanced Testing: Strengthening testing procedures for authentication services to ensure login functionality is verified after any configuration changes.
Configuration Validation: We have updated our documentation to include explicit warnings about authentication services override values and their regional implications.
Resource Protection: We are implementing additional safeguards to lock critical DNS and authentication resources to prevent overwrites.
We sincerely apologize for this authentication outage and understand the impact it had on your teams' ability to access dbt services. We hold ourselves accountable to extremely high standards of reliability, and we take our responsibility as stewards of your data and workflows very seriously.
Our teams have already implemented immediate corrective actions and are rapidly deploying additional safeguards to protect against similar events. We are committed to continuously improving our operational excellence, ensuring minimal disruption and quick recovery should future issues arise.
Thank you for your continued trust and partnership. If you have any questions or need further assistance, please contact us directly at support@getdbt.com.