Steve Jones asked data professionals to cover four days in our lives, so I blogged about a normal non-client-facing day, and then an unproductive day.
Today is different: I’m client-facing. We specialize in a 3-day SQL Critical Care:
- Day 1 – we meet with the client to dig through their SQL Server together while we ask them questions about their database, indexes, queries, designs, RPO/RTO goals, hardware, and more.
- Day 2 – we split up. The client goes back to doing their thing, and we write the findings for them. (You can see examples of the findings on that page above.)
- Day 3 – we meet again to deliver the findings, which are a mix of consulting and training. We tell you the fastest way to relief for the pains your SQL Server is facing, and make sure you’re confident in moving forward. (Sometimes clients hire us to fix the problems directly, too, but ideally we’d rather show you how to fix the pain permanently rather than get hooked on expensive pain relief narcotics, er, consulting.)
Today is day 1. Let’s make the magic happen.
6:00AM-6:30 – Emails. Small catch-up stuff.
6:30-7:30 – Reading and learning. Not a lot of RSS stuff from overnight (no surprise, since it’s Monday morning) but Hacker News has a really interesting discussion: The Secret Life of an Autistic Stripper. Forget the article and the inflammatory title – the comments are just interesting to read because there’s such a wide spectrum of personalities in the IT community. Answered a DBA.se question about reading the MySQL transaction log. Happy to see the photos coming in from the Chicago-Mac race, thinking back to when I did it. Saddened to read of a sailor who fell overboard and went missing, and his life vest didn’t inflate. That race is no joke.
7:30-8:00 – Breakfast. Coffee and yogurt. While eating, drool over a cocaine-tastic 1986 911.
8:00-8:15 – Email Richie. He was out last week, so I catch him up to speed on the GitHub issues I filed over the last week. We have 117 open issues at the moment with a variety of strategic and tactical stuff, pull requests, etc., and I’m sure he’s got a ton of notification emails, so I wanted to help prioritize stuff.
8:15-8:45 – Review client DMV data. Clients run an app that sends us an Excel spreadsheet with data from sp_Blitz, sp_BlitzCache, sp_BlitzIndex, etc. sliced and diced a few different ways – like their plan cache sorted by different metrics – plus a lot of their query plans. I spent some time with it last week, but I want to refresh my memory.
9:00-10:00 – Call starts, talk strategy. We talk about their pain points – the most important issues they want solved during the engagement – and I talk about a few big-picture challenges. For example, this client has a ~10TB database, and in the event that someone has an “oops” query like dropping a table, they want to be able to recover with no more than 1 minute of data loss and under 1 hour of downtime.
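To put that goal in perspective, here’s a rough back-of-the-envelope sketch in Python. It assumes the worst case, where recovering means restoring the entire ~10TB database and the whole hour is spent copying data, and it uses decimal units (1TB = 1,000,000MB). The function name is made up for illustration, and a real recovery from a dropped table might not need a full restore at all.

```python
def required_restore_throughput_mb_per_s(db_size_tb, rto_minutes):
    """Sustained MB/s needed to copy db_size_tb terabytes within rto_minutes.

    Assumption: decimal units, and the entire window is spent moving data.
    """
    size_mb = db_size_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return size_mb / (rto_minutes * 60)


# ~10TB database, 1-hour downtime goal (both figures from the call)
print(f"{required_restore_throughput_mb_per_s(10, 60):,.0f} MB/s")

# Hypothetical: only 45 minutes left for the copy once the problem is noticed
print(f"{required_restore_throughput_mb_per_s(10, 45):,.0f} MB/s")
```

Any time spent noticing the problem, finding the right backups, or replaying log backups comes out of that same hour, so the required rate only goes up from there.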
10:15-12:00 – Analyzing query memory grants and query plans. We were called in for unpredictable, random performance slowdowns. Thanks to my research into the client’s data ahead of time, I know someone’s running queries that get a really big memory grant, but then don’t use it, and finish quickly. We run sp_BlitzCache @SortOrder = 'memory grant' and catch a few – including this gem where SQL Server’s cardinality estimation has gone haywire:
To be clear, it’s not really a bad query: SQL Server’s just making some really questionable guesses about how many rows will come back. In actuality, only a couple hundred thousand rows come back. We talk about the source of those queries, how to tune them, and whether we can offload them to a different server. The server’s other bottleneck is queries that go parallel to burn CPU power, so we start with sp_BlitzCache @SortOrder = 'cpu' and walk through some of the top offenders.
12:00-1:00 – Lunch break. Erika made chicken chili.
1:00-2:00 – Analyze VMware configuration & performance. There are multiple ways you can fix a CPU bottleneck, and the goal of the SQL Critical Care® is to find the right one for a given client. This VM’s wide NUMA configuration made things tough for VMware’s CPU scheduling, and I wanted to find out why it was built that way, and if the team was open to changing it. To learn more about this topic, check out Frank Denneman’s excellent NUMA Deep Dive series.
2:15-3:00 – Test trace flag 2335 behavior. The current production server was running this trace flag, and nobody knew why. The resulting query plans showed some really odd memory grants. We took one of the queries over to a QA server where we could flip this trace flag on & off. The devil’s always in the details on that kind of thing. I’ve got the data I need to build their findings, so I bid them adieu.
After client calls finish, I need to walk away from the data for a while before I start assembling findings. I find that if I just start writing up their deliverables right away, I can’t see the forest for the trees. I do one final set of notes in the client’s file, then close it all down for the day.
3:00-3:30 – Emails & reading. I start by getting back to inbox zero, then check Feedly for updated blog posts.
3:30-4:00 – Testing ConstantCare.exe. Richie being the development machine that he is, he’s already implemented a few of my notes from this morning, and needs me to test an updated build in preparation for an early access build for clients.
And I’m out! Very productive day. Tomorrow I’ll work on their findings, and then deliver ’em on Wednesday. Thursday & Friday, I’m proctoring Drew Furgiuele’s PowerShell class while updating my slides for my upcoming Mastering Server Tuning class. Next week, I’ll be out of the office as we move to San Diego.
I do have a few other styles of days, and I’ll blog a couple of those in August:
- Mentoring days – when I analyze SQL ConstantCare® client data in Power BI and send them advice emails
- Onsite days – when I fly to a client’s office, assess their servers, and then teach training classes relevant to the problems they’re facing
- Teaching days – when I host a class (but that one doesn’t lend itself well to the blogging)