Write-up
Some users experiencing "IDE Requires Restart"
Summary

Between November 20, 23:42 UTC and November 21, 02:23 UTC, customers with IP Restriction enabled on their dbt Cloud accounts were unable to launch the Cloud IDE (Studio). Instead, the IDE would attempt to start and then immediately restart, in a loop. Existing sessions remained active and all other dbt functionality continued to operate normally. In total, 47 customer accounts across multiple regions were affected.

Impact
  • Affected Services: dbt Cloud IDE (Studio) session initialisation for accounts with IP Restriction enabled

  • Duration: November 20, 23:42 UTC to November 21, 02:23 UTC (approximately 2 hours 41 minutes of active impact)

  • Customer Impact:

    • 47 customer accounts were unable to start new Cloud IDE sessions

    • Users attempting to access their development environments saw the IDE stuck in a continuous restart loop

    • Failed session attempts ranged from 1 to 478 per account, with some accounts recording hundreds of failures

    • Existing IDE sessions that were already running continued to work normally

    • No data loss or configuration changes occurred

    • All other dbt Cloud functionality (scheduled runs, API, jobs) remained unaffected

We sincerely apologise for this disruption to your development workflow. We understand the critical importance of IDE availability for your data transformation work and are implementing multiple improvements to prevent similar issues.

Root Cause

The root cause was a deployment on November 19, made as part of a project to harden overall platform security, that changed how the Studio startup script determines the request context used to initialise IDE sessions. After this change, the connecting IP address was no longer available in that context. Because IP Restriction requires validating the connecting IP, the missing value caused the validation to fail. The IDE automatically retried session creation, which led to a continuous restart loop for affected accounts.
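
The failure chain described above can be illustrated with a minimal Python sketch. All names here (`RequestContext`, `client_ip`, `validate_ip_restriction`, `start_session`, `simulate_client`) are hypothetical and are not taken from the dbt Cloud codebase. The sketch only assumes that validation fails closed when no connecting IP is available and that the client retries after every failed start.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RequestContext:
    # Hypothetical container for per-request data; not dbt Cloud's actual type.
    account_id: int
    client_ip: Optional[str]  # None models the value that went missing


def validate_ip_restriction(ctx: RequestContext, allowed_cidrs: List[str]) -> bool:
    """Return True only if the connecting IP falls inside an allowed range."""
    if ctx.client_ip is None:
        # Fail closed: with no IP to check, access cannot be granted.
        return False
    ip = ipaddress.ip_address(ctx.client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def start_session(ctx: RequestContext, allowed_cidrs: List[str]) -> str:
    """Return 'running' on success, or 'restart' when validation fails."""
    if not validate_ip_restriction(ctx, allowed_cidrs):
        return "restart"
    return "running"


def simulate_client(
    ctx: RequestContext, allowed_cidrs: List[str], max_attempts: int
) -> Tuple[str, int]:
    """Model a client that retries on every 'restart'.

    The attempt cap exists only so the demonstration terminates.
    """
    attempts = 0
    state = "restart"
    while state == "restart" and attempts < max_attempts:
        state = start_session(ctx, allowed_cidrs)
        attempts += 1
    return state, attempts
```

With a populated `client_ip` inside an allowed range, the simulated client reaches `running` on the first attempt. With `client_ip=None`, every attempt returns `restart` regardless of the allowlist, which corresponds to the restart loop that affected accounts saw.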

This issue did not immediately trigger alerts because:

  • The number of impacted accounts was relatively small compared to overall IDE startup volume

  • The failure mode produced persistent retries rather than a high short-term error spike

Next Steps or Lessons Learned
Remediation or steps we have already taken
  • Immediate Fix: Deployed a hotfix at 02:23 UTC that restored proper IP address handling while retaining the security hardening changes

  • Verification: Manually tested and confirmed all affected accounts could successfully create IDE sessions

  • Enhanced Logging: Added improved logging for IP extraction failures to aid faster diagnosis

Planned Remediation

Immediate Actions (Within 2 days):

  • Add comprehensive unit tests covering IP Restriction rules in the IDE startup script

  • Implement a validation suite for Studio startup that runs automatically after each deployment
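
As a rough illustration of the unit tests mentioned above, the following sketch uses Python's standard `unittest` module. The function `is_ip_allowed` is a hypothetical stand-in for the check performed by the startup script, and the address ranges are documentation-only examples, not real customer rules.

```python
import ipaddress
import unittest
from typing import List, Optional


def is_ip_allowed(client_ip: Optional[str], allowed_cidrs: List[str]) -> bool:
    """Hypothetical stand-in for the IP Restriction check."""
    if not client_ip:
        # A missing IP is rejected rather than raising an exception.
        return False
    try:
        ip = ipaddress.ip_address(client_ip)
    except ValueError:
        # A malformed IP is also rejected cleanly.
        return False
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


class IpRestrictionTests(unittest.TestCase):
    ALLOWED = ["203.0.113.0/24", "2001:db8::/32"]

    def test_ip_inside_allowed_range(self):
        self.assertTrue(is_ip_allowed("203.0.113.7", self.ALLOWED))

    def test_ip_outside_allowed_range(self):
        self.assertFalse(is_ip_allowed("198.51.100.1", self.ALLOWED))

    def test_ipv6_inside_allowed_range(self):
        self.assertTrue(is_ip_allowed("2001:db8::1", self.ALLOWED))

    def test_missing_ip_is_rejected_without_raising(self):
        self.assertFalse(is_ip_allowed(None, self.ALLOWED))

    def test_malformed_ip_is_rejected_without_raising(self):
        self.assertFalse(is_ip_allowed("not-an-ip", self.ALLOWED))


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IpRestrictionTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes the five cases. In a real project these would normally be collected by the test runner already in use rather than invoked by hand.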

Short-term Improvements (Within 30 days):

  • Implement graceful error handling for Studio session startup failures to prevent restart loops

  • Add specific alerting rules to detect sustained Studio startup error rates, even in low-volume scenarios
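
One possible shape for graceful startup error handling is a bounded retry that stops immediately on failures that cannot succeed on retry. The sketch below is an assumption about how this could look, not the planned implementation. `SessionStartError`, its `retryable` flag, and `start_with_bounded_retries` are hypothetical names.

```python
import time
from typing import Callable


class SessionStartError(Exception):
    """Hypothetical error raised when a session fails to start."""

    def __init__(self, message: str, retryable: bool):
        super().__init__(message)
        self.retryable = retryable


def start_with_bounded_retries(
    start: Callable[[], str],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try to start a session; surface the error instead of looping forever."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return start()
        except SessionStartError as err:
            # Non-retryable failures (for example a failed access check)
            # are re-raised at once; so is the final failed attempt.
            if not err.retryable or attempt == max_attempts:
                raise
            # Exponential backoff between attempts: 0.5s, 1s, 2s, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    # Not reached: every loop iteration either returns or raises.
    raise RuntimeError("unreachable")
```

Under this design a validation failure would be reported to the user after a single attempt, while transient failures are retried a limited number of times with increasing delays. The `sleep` parameter is injectable so the behaviour can be tested without waiting.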

Long-term Prevention (Ongoing):

  • Establish mandatory test coverage requirements for all RequestContext-related changes

  • Implement feature flags for all authentication- and permission-related deployments

  • Enhance monitoring to segment error rates by specific features (like IP Restriction) rather than just overall rates
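
To show why segmenting matters, here is a small sketch that computes error rates per segment alongside the overall rate. The segment labels and the event counts are made up for illustration and are not figures from this incident.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each event is (segment, succeeded). Segment names are hypothetical labels.
Event = Tuple[str, bool]


def error_rates_by_segment(events: Iterable[Event]) -> Dict[str, float]:
    """Return the fraction of failed events for each segment."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for segment, succeeded in events:
        totals[segment] += 1
        if not succeeded:
            failures[segment] += 1
    return {segment: failures[segment] / totals[segment] for segment in totals}


def overall_error_rate(events: Iterable[Event]) -> float:
    """Return the fraction of failed events across all segments."""
    event_list: List[Event] = list(events)
    if not event_list:
        return 0.0
    failed = sum(1 for _, succeeded in event_list if not succeeded)
    return failed / len(event_list)


# Illustrative data only: 980 successful startups without IP Restriction
# and 20 failed startups with IP Restriction enabled.
sample_events: List[Event] = (
    [("no_ip_restriction", True)] * 980 + [("ip_restriction", False)] * 20
)
```

For `sample_events`, the overall rate is 2%, which could sit below a typical global threshold, while the `ip_restriction` segment fails 100% of the time. An alert evaluated per segment would flag the second figure even though the first looks healthy.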

We are committed to improving our deployment practices and monitoring capabilities to ensure the reliability of all dbt Cloud features, regardless of adoption rate. Thank you for your patience during this incident.