Fearless Big Data

Databases are intrinsically hard. They contain/represent so much business value that the risks of downtime or other major incidents are, in some cases, catastrophic. With so much risk attached, many organizations find themselves stuck managing them with legacy practices: non-representative staging environments, manually applied upgrades, complicated multi-step schema change rollouts… It’s not pretty out there!

What if, instead, our databases were as well-managed as the rest of our production infrastructure? What if we had a delivery pipeline that not only gave us the confidence to fearlessly roll changes into production, but also made experimentation simple? What if our databases felt… nimble?

Here at the federal government’s Centers for Medicare & Medicaid Services’ Blue Button 2.0 API program, we’re blessed/cursed with a legitimately large database: 8 TB. Terrifyingly, we’re now evaluating proposals that would grow its total size to almost 50 TB. Is that a good idea? It needs to be!

To get there, we’ve put together some conceptually simple principles and changes that will drastically improve our database management, evolving it from a burden into an opportunity. This talk will detail those principles and our implementation efforts towards them:

  1. Realistic Disaster Recovery
  2. Test Data Management
  3. Worry-Free Schema Migrations
  4. Trouble-Free Data Ingestion

Karl Davis
Karl has been a software engineer for over a decade now and, amazingly, still thinks that computers are great. People — and the software they create — are trickier, but not entirely hopeless.