
Book Review: Google’s Site Reliability Engineering #tsql2sday

6 years ago
book review, devops, site reliability engineering

For this month’s T-SQL Tuesday, Grant Fritchey’s topic is Databases and DevOps.

Summary: you should skim this free online book to see inspiring ideas of how administration works at scale, although don’t expect to put the practices into place without management buy-in.

Let’s get one thing out of the way first:
you, dear reader, probably don’t work at Google scale.

Google faces problems similar to your employer’s, just at a different scale. Instead of keeping 3-server applications reliable, they need to keep 3,000-server applications reliable. As a result, they have very different budgets than you do, and that gives them the luxury of treating reliability as a serious discipline.


In their free Site Reliability Engineering book, they share some of the lessons they learned about:

  • Designing service level objectives
  • Tactics like deployments, monitoring, automation, and release engineering
  • How to load balance and handle overload, and much more

You really don’t need to read the whole book – just skim it, and you’ll take away interesting concepts and stories. While DevOps and SRE aren’t the same thing, you’ll start to see how your DBA duties, DevOps duties, and developer duties all blend together to work towards the same business goals.

My favorite concept: error budgets

Say you’re given a 99.5% uptime goal. Instead of thinking of it in terms of time, think of it as, “0.5% of my service’s requests may result in errors.” Maybe the entire service is unusable, maybe it returns a failure of some kind, maybe it times out.

Instead of aiming for 0% errors, aim for 0.5% errors or fewer.

0.5% is your error budget, and you’re expected to spend it.
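To make the arithmetic concrete, here is a minimal Python sketch of that budget. The function names, the one-million-request window, and the 30-day month are all assumptions picked for illustration; they don’t come from the SRE book.

```python
def error_budget(slo_percent, total_requests, failed_requests):
    """Summarize error budget usage for one measurement window.

    slo_percent is the success goal (99.5 means 0.5% of requests may fail).
    All names here are hypothetical, chosen for this example.
    """
    budget_fraction = (100.0 - slo_percent) / 100.0
    allowed_failures = budget_fraction * total_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        # Share of the budget already spent: 1.0 means it is all gone.
        "spent_fraction": (
            failed_requests / allowed_failures if allowed_failures else float("inf")
        ),
    }


def downtime_equivalent_hours(slo_percent, window_days=30):
    """Translate the same budget back into hours of full outage per window."""
    return (100.0 - slo_percent) / 100.0 * window_days * 24


# Example: a 99.5% goal over 1,000,000 requests, 1,200 of which failed.
report = error_budget(99.5, 1_000_000, 1_200)
hours = downtime_equivalent_hours(99.5)
```

With those made-up numbers, the budget is about 5,000 failed requests and roughly a quarter of it has been spent. Expressed as time, 0.5% of a 30-day month works out to about 3.6 hours.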

You may spend it accidentally in the form of unplanned outages, purposely on planned outages for things like patching or major app code deployments, or maybe on just plain old experimenting with cutting costs. This starts to set the stage for why we need DevOps: developers want to spend part of the error budget on deployments.

DBAs usually aim for zero errors. DBAs don’t want to spend the error budget at all, but the business needs us to. If you don’t use any of your error budget, that’s a problem: it indicates that you’re probably spending too much (money or resources), skipping the upgrades and patching that keep your application current, or making developers work too hard to build absolutely perfect deployments every time (which cost a ton of money to build). In that case, you should probably look at ways to cut costs in order to get closer to the business’s objectives.
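One way to picture “not spending the budget is also a problem” is a tiny status check, sketched below in Python. The 25% cutoff for “underspent” is a hypothetical threshold invented for this example; neither the book nor this post prescribes a number.

```python
def budget_status(spent_fraction, underspent_below=0.25):
    """Classify error budget usage at the end of a measurement window.

    spent_fraction is failures divided by allowed failures (1.0 = budget gone).
    underspent_below is an assumed, illustrative threshold.
    """
    if spent_fraction > 1.0:
        return "overspent"   # more errors than the goal allows
    if spent_fraction < underspent_below:
        return "underspent"  # probably paying for reliability nobody asked for
    return "on track"


# A service that used only 10% of its budget is flagged, not congratulated.
quiet_service = budget_status(0.10)
busy_service = budget_status(0.60)
```

The point of the sketch is the middle branch: a nearly untouched budget gets flagged for a conversation about costs, patching, and deployment pace, just like an overspent one gets flagged for reliability work.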

Prefer videos? They’ve got videos too.

Check out the reliability engineering talk from Google Cloud Next 2017.

I’m certainly not saying that Google does everything right, or that you should model all of your practices after theirs. That’s ludicrous – they’re huge, and they have huge budgets. But there are some interesting lessons here for your own database operations and deployments.


Hi. I’m Brent.


I live in Las Vegas, Nevada. I'm on an epic life quest to have fun and make a difference.

I co-founded Brent Ozar Unlimited to help make your SQL Server go faster. I also maintain sp_Blitz® and the open source First Responder Kit repo.

My current car collection includes a Jaguar XKR-S, Porsche 944 Turbo, Porsche 356 Speedster replica, and a Ferrari 328 GTS.


© 2021 Brent Ozar, all rights reserved. Privacy Policy
