The Problems with Remote DBA Companies Today

First things first, I don’t want to be your remote DBA. No thank you, unsubscribe. I never want to be on call again. From 1993 until 2008, I had an electronic tether – first in the hotel & restaurant industry, then IT. When I went to work for Quest Software as an evangelist in 2008, one of the biggest job benefits was being able to not answer the phone on weekends and holidays. Oh, sweet personal life, how I missed thee.

But I get tempting emails from companies that usually go like this:

“We’ve got a few database servers that are mission-critical. We don’t have enough work to keep a full time DBA busy, so we’d like to save money by hiring a part-time remote DBA. We just want them to remote in for several hours a week, make sure the trains are running on time, and then when things break, they can jump in and troubleshoot it.”

It all sounds so easy, right? Just set up the servers with a few good scripts, and you can just keep cashing checks. After all, DBAs don’t work that hard, right? (It always looks that way from the outside.)

Those last few words are the killer: “when things break, they can jump in and troubleshoot it.” Remember, the company’s database server is mission-critical, and when it’s down, they want someone quick, fast, and in a hurry.

That only works if the consultant isn’t tending to anyone else’s emergency.

So you end up with a few kinds of remote DBA firms:

Remote DBA Option 1: The Lone Gunman

A DBA quits his job and starts doing contracting full time. He wasn’t that busy at his old company anyway, so he makes them an offer they can’t refuse: he’ll keep the lights on for them, and he’ll remote in for emergencies. He’s desperate to get his first client in, so he lowballs them on a retainer and hourly fee. He’s now managing a few servers for a single client.

From my remote DBA days

Over time, he adds more clients with more servers, but because his rates are really low, it takes him quite a while to build up enough momentum to keep the bills paid reliably. Emergencies don’t strike all that often, and he’s able to keep juggling these balls in the air to keep everybody happy. Clients love him because he’s way cheaper than a full time DBA, he’s smart, he knows their servers well, and he always answers the phone when there’s a problem.

Until that one day when he can’t.

Maybe he’s on the phone with someone else, or he’s at a ball game with a dead phone, or he’s on vacation out of the country, or his block has a power outage. He’s only human, and he’s only one human.

It doesn’t happen often, and some companies are okay taking that gamble.

Remote DBA Option 2: The Global Team

A company specializes in remote DBA work, and they hire a team of people. You can’t make the financial numbers work on this business with just senior people, because senior people want to get out of the on-call game. You have to pay senior people serious money if you want them to do on-call full time without burning out. Instead, they hire a few tiers of different skills from junior DBA to senior DBA. They set up on-call rotations, plus escalations so that if a junior person can’t solve a problem, they can bring in a senior person.

To make the juniors’ jobs easier, they build standardized runbooks that lay out formal procedures on how to handle a failed backup, a stopped SQL Server service, a drive that’s out of space. When the firm brings in a new client, they have a process for that too, vetting the server and getting the client to approve the runbook.
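As a rough illustration of what a standardized runbook with escalation might look like once written down, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Runbook` class, the `next_action` helper, the step wording, and the time limits are hypothetical, not taken from any real firm's process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runbook:
    """One standardized procedure (hypothetical structure, for illustration)."""
    alert: str
    steps: tuple
    escalate_after_minutes: int  # assumed time limit before a senior DBA is paged


# Illustrative entries only; a real runbook would be agreed with the client.
RUNBOOKS = {
    "failed_backup": Runbook(
        alert="failed_backup",
        steps=(
            "check the job history for the error",
            "confirm free space on the backup target",
            "rerun the backup job",
        ),
        escalate_after_minutes=30,
    ),
    "service_stopped": Runbook(
        alert="service_stopped",
        steps=(
            "check the error log for the shutdown reason",
            "start the SQL Server service",
        ),
        escalate_after_minutes=15,
    ),
    "drive_out_of_space": Runbook(
        alert="drive_out_of_space",
        steps=(
            "find the largest recent files on the drive",
            "ask the client before deleting anything",
        ),
        escalate_after_minutes=20,
    ),
}


def next_action(alert, minutes_elapsed):
    """Decide whether the junior DBA keeps following the runbook or escalates."""
    runbook = RUNBOOKS.get(alert)
    if runbook is None:
        # No approved procedure exists, so the junior DBA should not improvise.
        return "escalate: no runbook for this alert"
    if minutes_elapsed >= runbook.escalate_after_minutes:
        return "escalate: runbook time limit reached"
    return "follow runbook: " + "; ".join(runbook.steps)
```

The point of the sketch is the shape, not the details: anything outside the approved list, or anything that takes too long, goes up a tier.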

The remote DBA company’s entire business is based around standardizing processes so that any member of their DBA teams can step in and handle a call.

And that’s exactly what happens.

When you call a remote DBA firm, ideally you’ll get your primary contact – just as you get the Lone Gunman when you call him – but that’s not going to happen when the server breaks after hours, on weekends, or on holidays. The bigger the DBA firm – the more successful they get – the more likely you are to talk to someone who’s only seen your systems once or twice, if ever. It’s a qualified voice (hopefully), but it’s not a familiar one.

It’s also more expensive, because building, training, and managing this kind of around-the-clock support team all costs money. This firm will cost more than the Lone Gunman, and I’ve seen cases where it cost more than just hiring a full time DBA. However, you could argue that it’s better than a full time DBA because it includes around-the-clock support, even when your own Lone Gunman FTE is off on vacation.

Remote DBA Option 3: Cloud Services

The marketing behind cloud services – and I mean things like Windows Azure SQL Database and Amazon RDS, not virtual machines running plain old SQL Server – says that the cloud vendor does the lights-on support for you.

They do this by simply controlling the platform entirely. You don’t get the option to remote desktop into your WASD instance and apply patches. You don’t get involved in HA or DR failovers. When things break, the vendor fixes them.

In theory, it’s way cheaper and much more flexible. You solve multiple problems at once – not only did you have too little work to keep a full time DBA busy, but honestly, you probably had too little work to keep a properly configured SQL Server busy full time, too. With well-designed resource sharing and the economies of scale, you can get the right amount of database power and people that you need.

Except today, companies don’t want to move their databases to the cloud just to solve the part-time-remote-DBA problem. Today, they’re still too focused on keeping the rest of their infrastructure in-house, so moving everything to the cloud just because they can’t keep a DBA busy seems bass-ackwards. Instead, they compare the pros and cons of the Lone Gunman, the Global Team, and just going without DBA attention altogether.

As a DBA in the 2000s, I was worried that outsourcing & remote DBA firms would take over conventional DBA jobs. They didn’t. In the early 2010s, DBAs were similarly concerned about cloud services taking over DBA jobs. That hasn’t happened yet either, and based on what I’m hearing from companies, I don’t think it’s going to happen in the near term (3-5 years).

Today’s Options are Bad, but the Problem Still Exists

I know a lot of my clients would love to get this problem solved – but I just don’t have a good answer for them. Relational databases aren’t getting easier to manage. In fact, with every new release of SQL Server, I see new ways for customers to shoot themselves in the foot with ever more powerful weaponry. (Don’t get me wrong – I love new features, but they don’t manage or configure themselves.)

Part of tomorrow’s answer will be software. For companies with SQL Server and sysadmins, I like where Idera’s going with SQL Elements. For shops where the developers manage SQL Server, I love Stack Exchange’s open source Opserver. Both tools still require meat popsicles to interpret the dashboard and take action, but they’re bringing insight to users without formal DBA training.

I love watching this market as it grows and changes over time. I don’t have a horse in this race – I’m never going on call again, and we don’t want to hire employees to put them on call – but I get excited every time I see a new solution.

Because I don’t want you to be on call either.

Posted by Brent in Blog Posts

About Brent

I'm a geek goofball who lives in Chicago, Illinois with my girlfriend Erika, my small dog Ernie, and a lot of bad habits. I like wine, travel, photography, and sharing everything I've learned so far.

24 Responses to The Problems with Remote DBA Companies Today

  1. Sean Long

    You mention an interesting concept here – companies using relational databases in mission critical applications that feel like they don’t have enough work for a full time DBA.

    I wonder how true that ends up being if you consider the full scope of what a DBA should be doing? It seems to me that if a company uses a DB enough to consider it mission critical and need three nines (99.9%) of uptime or more, there would be enough work to justify a DBA: for instance, plotting out database growth, forecasting bottlenecks, query design, and DR/HA planning/testing, as well as helping create/tune the usage of that database, like when the reporting people need help getting the data out in a usable way.

    This may just be my own naivety playing up, but it seems to me that something feels wrong about considering a piece of IT mission critical without supporting it.

    • Brent

      Sean – take a step back and think like the business. Every job could be done better if we spent more money on people and practices. The offices could be cleaner if we hired more janitors. The accounting could be more accurate if we hired more accountants to double-check and classify things. The applications could be faster if we hired more developers.

      At what point is it enough?

      For many businesses, “enough” happens long before someone plots out database growth and forecasts bottlenecks that aren’t here yet. After all, a simple code change or an additional field in a table can render all that forecasting useless. Heck, even at StackOverflow.com, one of the 50 biggest web sites in the world, there’s not enough work to keep a DBA busy full time because the sysadmins (admittedly top notch guys) take care of the database work.

      • Sean Long

        That’s a really good point. Minimum Viable Product and all that.

        Gotta keep the entire “being profitable” thing in mind. Thanks for the example of Stack Overflow too, I had no idea they didn’t have DBAs on staff. Although considering their sys admins’ skill sets, would you consider some of them DBAs by skill set (instead of title)?

        • Brent

          Sean – well, they’d also be programmers and architects by way of skill set, but you can’t use someone’s skills as a definition of their job. You have to look at what tasks they perform on a regular basis. (For example, I’ve got DBA skills, but I’m definitely not a DBA these days either.)

  2. Paul

    My company has been running the remote DBA model (your option number 2) for 15 years now. Over 20 full-time staff and all are senior DBAs (Oracle, SQL Server or both) – we don’t do juniors and we make a point of having everyone on staff rotate through clients’ servers to do health checks and the like. Sure there’s always a few staff intimately familiar with a given customer but at least everyone else is mostly familiar – and you can always call on someone else if you don’t know the environment well enough.

    The key to making it work is finding the people who are happy to answer the phone at 3am [Note to self: not Brent Ozar] from clients or other staff – and paying them well for it.

    Low staff turnover and continual business growth suggests this model works well for us. But the whole market might be far different down here in Australia.

    • Brent

      Paul – when you say “always a few staff intimately familiar with a given customer”, how many servers/applications do you think one person can be intimately familiar with?

      Not a hypothetical question, really curious here.

      • Sean Long

        At some point there’s gotta be a diminishing set of returns for that sort of thing. In product support, we found that you could cross train someone to maybe one or two additional products but after that it just became too much to keep straight. I’ve never had to manage more than a couple labs’ worth of SQL instances, but I’d imagine that the definition of being “intimately” familiar with an environment gets looser the more rigs you add.

      • Paul

        That depends on the person really. We have staff that can recite by rote the exact configuration of every one of several clients’ servers and can tell you exactly what issues have happened for the last 5 or 10 years.

        We also have staff (like me) that don’t have that level of memory… so I would be generally familiar with most of our clients, but not intimately with any. Then again I’ve only been with this company for 4 months now, so I haven’t had the time to build up that level of familiarity.

        There’s nothing wrong with you not being the type of person that can do that – you have your strengths (plenty of them) and this obviously isn’t one of them – but that’s not to say that there aren’t people out there who can build a working implementation of that business model – and I believe the company I work for has that.

        The point I was trying to make, which perhaps I didn’t convey adequately while writing my previous response at 5am from my Galaxy phone with a cracked screen that’s hard to read and type on, is this:

        You have asserted, specifically dealing with “option 2”, that a dedicated remote DBA company is not financially viable unless you have lower paid staff that do the grunt work and escalate to the few senior DBAs when the need arises. As I mentioned we employ about 25 staff, all are senior level DBAs, all with around 10-15 or more years’ experience. It is a point of difference with other remote DBA companies out there that might adopt your suggested model – we can give the client the assurance that nobody with less than 10 years’ experience is going to be on their servers checking random logs or the like and calling that a good service.

        The company has done, financially, quite well out of this model, and our clients have benefited substantially from the volume and breadth of experience on staff.

        Should there be a situation where 3, 4, 5 or more of our clients have major mission critical issues at the one time – well we’ve got enough staff on board to deal with that. Though I can say, touch wood, that such a situation has not occurred yet.

        • Brent

          Paul – I appreciate the lengthy email, but you’re dodging the question.

          How many servers can one person intimately know?

          It’s a simple question – go ask those coworkers that you’re bragging about. Let’s hear the answer.

          • Paul

            I believe the first paragraph answered that perfectly well. You want an exact numerical value? How long is a piece of string?

            Ok, I took the bait and asked the question – “definitely more than 20”, “a couple of dozenish” and “about 30” are the replies from the three I asked.

            So what’s your magic number that you think people can no longer keep track of servers?

            I’m not suggesting that having a remote DBA to look after things is going to work for every business, it’s obviously not. But there are plenty of businesses (at least in my geographical region) out there that do benefit substantially from this model.

          • Brent

            Paul – perfect, that makes sense. Let’s continue to think through this, assuming that about 30 is the magic number (since it was the highest one any of your team gave).

            Think of it as kinda like a RAID 5 problem: how much data do we need to protect? In this case, it’s knowledge about servers rather than data, heh.

            Let’s also assume that each team is made up of 5 DBAs. At any time, one DBA has to be unavailable without affecting the company. (People take vacation, call in sick, or are working on a case.) So if one DBA disappears, the other remaining 4 have to be able to completely cover his knowledge. 30 servers divided by 4 people equals about 8 servers that each person has to know, even though it’s not their primary responsibility.

            If each person knows 30 servers, minus the 8 they have to know for someone else, that’s 22 servers they can cover – except that assumes that we all know which one person will disappear in advance. In real life, that’s not the case – so each person can know even fewer servers, because they have to be able to cover for others.

            So now comes the really cool question: how many servers can each person manage? It has to be fewer than 30. How many fewer depends on how much the company is willing to sacrifice service. In a perfect world, you’d only manage half of 30 – because you’d have a buddy system where every server is known by at least 2 people. That means each DBA is managing 15 servers or fewer. And if they’re all senior DBAs, this is where the numbers start to fall apart.

            Unless of course you’re cutting corners and billing customers for more than 20 servers per team member.
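
            A rough sketch of the arithmetic in the comment above, written in Python. The function name, the dictionary keys, and the choice to round the backup share up to a whole server are assumptions made for illustration; the inputs (about 30 servers known per person, teams of 5) come from the thread.

```python
import math


def coverage_load(servers_known, team_size):
    """Back-of-the-envelope cover-for-a-missing-teammate arithmetic.

    servers_known: how many servers one DBA can know intimately.
    team_size: DBAs per team, one of whom may be unavailable at any time.
    """
    # If one DBA is out, the remaining teammates split that person's servers.
    remaining = team_size - 1
    backup_share = math.ceil(servers_known / remaining)

    # What is left for a DBA's own primary servers after reserving that share.
    primary_after_backup = servers_known - backup_share

    # Buddy system: every server is known by two people, so each DBA can be
    # primary on only half of what they are able to know.
    buddy_primary = servers_known // 2

    return {
        "backup_share": backup_share,
        "primary_after_backup": primary_after_backup,
        "buddy_primary": buddy_primary,
    }


result = coverage_load(servers_known=30, team_size=5)
print(result)
```

            With 30 servers and a team of 5, this gives a backup share of 8, 22 servers left for primary coverage, and 15 under the buddy system, matching the numbers in the comment. Changing either input shows how quickly the per-person count moves.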

  3. Jim Murphy

    Interesting perspective Brent. Sparked quite a lively conversation. You make a lot of really good points and it shows that you’ve been in the business.

    We are option 1.5. We use C#/Powershell to automate a bunch, use a Jr. DBA to ensure the automation is working properly and a few other DBAs (Sr. and Mid) for HA/DR, Query/Index tuning, capacity forecasts, etc. Very small team though. A few DBAs, a few C# developers and a few back office admins and me. Our goal is to stay fairly small and never become a big #2 (pun intended).

    I think the answer to your number-of-servers question is “it depends” (of course, right?). Depends on how complicated the server/environment is and also how loaded the server is. An AlwaysOn deployment combining FCI for auto-failover and AG for Async reporting/DR is quite a bit more complex and time consuming than 3 or 4 standalone servers that are internally used by 15 of a client’s employees. But your argument is a good one and the same argument used by a CTO to determine when a 2nd, 3rd or 4th FTE is needed – issues that only the larger corporations (and small part-time consultancy companies) face. If only SELECT MagicWand FROM dbo.MagicHat returned a row or two.

    It is a tough business; I’m not going to lie. Unlike Paul, whose business sounds pretty smooth and easy, apparently from hiring only Sr folks, our world is more difficult and we stay pretty active. We are not living in chaos, but it is also not a walk in the park. And yes, we occasionally wake up in the middle of the night (Blah) to figure out why a reindex failed, why a client employee copied a bunch of files up and filled a drive, and all of the other normal DBA issues that regular FTEs also deal with.

    But I suppose if it was a pretty easy job, we wouldn’t be needed nor have any customers.

    • Brent

      Jim – good to hear from you sir!

      Yeah, as I mentioned in the post, I think software has to be a part of the solution moving forward. It’s always been a small part – DBAs would write their own DMV collection scripts, for example – but the software has to get dramatically more advanced to cover scenarios like the FCI + AG thing. Even still, when the poop hits the fan, the customer wants to be able to pick up the phone and talk to a DBA who’s well-versed with their systems.

      I’m glad you brought up the client employee issue – it’s so hilarious to watch the politics around that. I’ve seen That Guy at a client repeatedly undo all the good work done by the DBA – whether the DBA was a full-time employee or a remote contractor – and that creates a lot of ill will on both sides. An in-house DBA can lay down rules and lock down access, but it’s much tougher for an outsider to do that – especially when the outsider has to keep the customer happy and keep bringing in dough.

  4. Paul

    You’re going to face the exact same lack of redundancy being a fly in/fly out consultant-type, only with that model there is pretty much no possibility that you can work on two customers at once if you happen to have two disasters miraculously strike at the same time. In the remote case, while waiting an hour and a half for that 5TB database to restore for client one, you can at least make a start on problem client number two.

    Sure, we can argue on the worst case scenario of having 15 of our clients each having 3 of their servers go completely arse-up at exactly the same time, or we can accept that in the world of business you make certain assumptions and evaluate the potential risks. In the case of your business you make the assumption that Ozar Unlimited isn’t going to have more than 5 critical consulting jobs to attend to at any one time – if there are, you may need to start by cutting Jeremiah in half, or you accept that as a risk associated with the finite number of staff you have and tell the 6th client that they either have to wait a week or two for someone to finish the current consulting job, or they go and see what Paul Randal’s team is up to, or whatever other consulting firms might be around to address their issue…

    The exact same goes for the business I work for – we accept the risk that if the planets all align in such a way that the gravitational force causes solar flares to kill all our clients’ servers then there’s going to be some issues – and in acknowledging that we then work on the assumption, which has served the company well for 17 years now, that such a level of complete disaster is so unlikely that it’s not worth the cost and effort to attempt to mitigate it.

    Whether you approve of the level of risk or not is an issue for you to deal with. I’m sure though that you can acknowledge that even organisations with their own in-house DBA(s) are always going to make resourcing decisions based on careful analysis of the potential risks.

    In any case these replies are getting longer than the original blog post so how about we just agree that we have differing views on this part of the world and that we take different approaches to the world of SQL Server. You can continue to enjoy the success you are having with the manner in which you frame your business and I’ll do the same with mine.

    • Brent

      Paul – you’re starting to get it! Two crucial details though.

      My firm is scheduled weeks in advance and we don’t do on call work. (I said that at the beginning of the post.) There’s no scheduling risk involved in my model.

      In your model, YOU are making the risk decisions.

      Not your customer.

      That’s what I pointed out in the post. Sure, YOU can decide to oversubscribe and take on more clients than you could serve at once, but the client doesn’t know how oversubscribed you are, and they don’t get to make those risk analysis decisions. When they hire FTEs, they do.

  5. Pingback: (SFTW) SQL Server Links 24/01/14 • John Sansom

  6. Russell Tye

    Both of you make good points. It appears obvious to me that Paul’s firm and its business model have been successful for many years and I certainly would not knock it. On the other hand, Brent also makes a very good argument. It almost seems to be two sides of the same coin.

    • Brent

      They are indeed two sides – the business’s side, and the customer’s side. I was writing to the customers.

  7. Scott

    Two Questions
    1) How long before PaaS options like Amazon RDS or SaaS start to take a real bite out of the number of DBA jobs?
    2) Curious what the job satisfaction is for most DBAs. I know the pay is good, but with on-call issues and potentially a shorter career, have many people considered changing to another field? If so, what field and why?

    • Brent

      1. I think the real appeal of those types of platforms has always been for green-field (brand new, from scratch) app development. Those scenarios didn’t need a DBA anyway – at least right away – so I don’t see a short term change there.

      2. For a while, I saw a lot of database people gravitating to the BI field. More user-facing stuff, less on-call work.

  8. Kevin Wood

    While this post is about/directed at DBAs, just about any aspect of IT or skillset can be ‘commoditized’ in the same manner. And you can expect the same trade-offs and consequences. Smaller (and some larger) organizations are looking to move their e-mail to the cloud. IaaS and PaaS move different levels of responsibility from the organization to the provider. But remember, the provider, in order to keep its own costs down and improve profitability, also makes the tradeoffs as to how much to pay for those skills the organization is outsourcing.
    The organization must retain sufficient skills in-house to monitor the provider to ensure the provider is actually providing. I have actually been retained in the past to watch the provider watch the hosted servers. And typically if something went wrong, I (representing the organization) had to talk the provider through repairing the servers the organization was paying the provider to support. So your provider might not be putting superstars on ‘your’ servers.

  9. Ron

    Need a part time remote DBA? Send them my way!!

  10. Cosmo

    Brent, spot on as always. This year I just crossed the 20 year mark as a lowly DBA, with an on-call rotation no less. I have worked for small companies and for Microsoft themselves. I had a company called DataMark tell me they were not an IT company but a marketing company. I explained to them that data implied database; they simply did not care. Companies, like people, are biased, and ignorance runs wild in the streets like a frat boy on spring break! At the end of the day I love my current role because they get it. I’m part engineer, part builder, and part fireman. Some days I work hard with after-hours rollouts to DR failovers to ouch events; others I’m realistically done by 3 PM. But I own the environment, and I’m the one-stop shop for all questions SQL Server as far as management is concerned. Lone gun I’m not: I have Oracle DBAs that assist me and a junior SQL DBA under me. If any of them smell a farce, of course they will sound an alarm. Peer reviews matter! What I am is a Subject Matter Expert, and that matters! Quality really does matter; I think you implied this in your write-up. I just may be mad enough to think that this job is fun, but at the end of the day those machines make the company money. If they’re down, SLAs and reputations are on the line. This is why this work will always be in house and managed as such by people who care and are the wizard of Oz behind the machines. That is the difference. I love rants, carry on….

  11. Ed

    Oh Brent, you are one funny guy.

    Always love the way you write your stuff. I can only wish I get the chance to work with you :-)

    Remote DBAs are ‘good’ until you tell them you need something fixed ASAP. I hate the meter and the timer that my remote employer put on my desktop, like the first thing I have to do to get the user back online when the db is hung is to restart the database … ha ha ha …

    And so I finally, kindly say NO THANKS … I would still like to talk to and meet real people in person
