It’s time for an intervention. Here are 3 signs you might be a data hoarder – see if you recognize your databases in Wikipedia’s list of compulsive hoarding symptoms:
1. You tend to hold onto items that most people would consider not useful:
- Old catalogues and newspapers
- Freebies picked up, like cached copies of Internet data
- You say, “Drives are cheap, so it doesn’t really cost much to keep this around”
- Things that “might” be used one day
2. There’s so much clutter that servers and drives can no longer be used for their intended purpose, and some of your data is inaccessible. Examples include:
- Drives that are constantly out of space
- Turning on compression just to cram more data into fewer drives
- Too many servers and you’re unable to take care of them anymore
- Being surprised by data (“Oh wow, I forgot I even had that!”)
3. The clutter is so bad it causes illness, distress, and impairment.
- Visitors such as consultants aren’t invited in because it would be too embarrassing
- Coworkers argue a lot about the clutter
- Feel depressed or anxious much of the time because of the clutter
- You consider migrating to the cloud in order to get more space for your clutter
How to Get Help with Your Data Hoarding Problem
If you believe the above heading refers to a new way to process your “big data”, think again.
When you hire another “data scientist” to tell you how incredibly valuable your data can be, think about that person’s motivations. You’re paying them to tell you if your junk is worth something, and then you’re paying them again to take their time sifting through your junk.
Sooner or later, you have to wise up about the data you keep. Maybe there’s not a lot of value in the web site clicks your users made two years ago – back when your web site looked totally different, and you can’t even recreate that web site anyway because you don’t have a working version of it compiled, so you don’t even know what the links mean, and then you have to spend even more time getting valuable business people involved to try to remember what /sites/purchase/preorder/item45.aspx?special=1 was all about. I’m just saying.
Because if your office looked like your database, you’d be on television.
And I don’t mean America’s Next Top Data Model, either.
4 Comments
I couldn’t agree more, sir! This is hilarious since I just got done tweeting last night about my obsession with watching Hoarders on Sundays. Then this post! Coming from someone who works at a place with no real retention schedule, I can relate! It’s like trying to get grandma to clean up the hoard, only to have her yell at you while her cats are taking dumps in the corner of the living room. You can’t organize crap.
Are you trying to tell me that I’m supposed to tell my employer that their data is garbage? Oh wait, they already know that. They’re okay with garbage data. It’s now my job to make the garbage look pretty.
Actually I gather there’s kind of a divide between Big Data folks and Data Science folks. Anything with masses of data tends to be archived by the Big Data types, or associated “DevOps”. When you talk to living, breathing data scientists, they tend to be looking at just a gigabyte or three they selected for analysis. Maybe at some point they publish an analysis that is supposed to then be applied to the full petabytes, but it’s out of their hands at that point.
I don’t think anybody out in the world tries to save their “fire hose” data from phones or Twitter; they hold enough to process and then dump it to make room. Of course the big boys, Google and Facebook, are in a category of their own.
Nice one, Brent. It strikes me that a lot of “Big Data” is akin to tidying your bedroom by stuffing everything under your bed. You get a very quick result, but when you actually want to find something, it takes ages.