On Monday, April 22nd between 6:47PM UTC and 7:25PM UTC an issue occurred with internal database connections. This prevented some of our internal services from correctly connecting to our internal database.
During the outage, API requests returned an error status and dbt Cloud was effectively unavailable across the United States region. This also caused Semantic Layer connections to error out.
We apologize for the outage and to every affected customer. We are making improvements to our configuration and deployment process to ensure that a similar outage is unlikely in the future.
The root cause of the issue was a configuration change which altered the way dbt Cloud and a handful of other services connected to our internal database. The intent of the change was to force services to connect to our internal database through a connection pooler. A preexisting configuration value in the United States region conflicted with the change and caused the configuration to be applied incorrectly. This ultimately caused the internal database connection to be established with an invalid set of credentials.
Our existing alerting helped us identify an increased error rate on our internal services. We were able to correctly identify a recent deployment that triggered the outage and began the rollback procedure. As the rollback was deployed, the error rate dropped and services resumes normal operation.