You’ve backed up your data. You’ve secured your application code and infrastructure as code. You’ve documented runbooks. But when Google accidentally deleted a customer’s entire cloud environment, leaving it offline for a week, or when Change Healthcare’s partners refused connectivity for 90 days after a breach, backups alone weren’t enough. The real objective is recovering quickly and completely: data, infrastructure, networking, configuration, and security principals, all together. In this talk, you’ll learn the requirements for fast, reliable recovery, including cross-region and cross-account isolation, complete workload recovery, and testability. We’ll walk through the design decisions you need to make and how you can implement them using cloud-native tools, all while applying DevOps principles (automation, culture, lean practices, measurement, and sharing) to reduce toil and test your DR strategy safely and frequently.
