“You just deleted a live fileserver, huh?”

Or, Tales From The Help Desk Ch 2.

For our next installment in the Tales From The Help Desk series, let’s take a look at one of my all-time favs.

OK, “fav” may set the wrong tone.

It’s more of a “hands-down the most idiotic maneuver I’ve ever seen” kind of thing.

Since the title gives away the punchline, I’ll jump right into the background. This happened at the same consulting firm as the first incident involving Cake Batter Boy. So you know it’s gonna be epic. Primary difference is this bonehead move was accomplished by a member of our IT team. Yep, a system admin with a fair bit of experience in fact.

Not our actual server room

Here’s the set-up.

My team was an offshoot of the main help desk team and had an office/workroom about 20 yards down the hall from that team’s walk-up window (we didn’t have one of those, a fact I am forever grateful for). One morning, just after our morning rush of “aarrgghhh, my laptop did a thing” visitors had died down, if I recall, we heard a commotion from the hall down by the window. Since they weren’t open yet, I sent one of my guys down to see if we could be of any assistance.

He came back 5 minutes later choking back fits of laughter.

Well that can’t be good.

After he calmed himself by reorganizing the power adapter bins in the back room (don’t ask, that was his thing), he was able to tell us that the ruckus was related to something we had noticed earlier but didn’t realize was company-wide: the primary server that most departments kept their proprietary work on was unreachable. I had actually been working on a file on that server earlier that morning, then got distracted by our walk-ins and hadn’t gotten around to trying again yet.

Background Info break

This was a consulting/research/software (there had been acquisitions) firm that worked primarily in the healthcare world, with an offshoot group that dealt with higher education institutions. That means we dealt with a LOT of HIPAA data, as well as our own proprietary code related to the software we developed for our healthcare clients. And the majority of that data was stored on employee network drives to avoid having it accessible should a laptop be stolen (stay tuned, there’s a doozy of a story about that coming later in the series).

And the majority of those employee network drives were, you guessed it, on one server. There were redundant copies and backups, of course; we weren’t complete amateurs. But the vast majority of the work being done on a daily basis was done on that one server.

We’re not amateurs, Chuckles may be…
Each one of these drives may contain several virtual servers

That comment about us not being amateurs? Yeah.

This server was not its own discrete piece of hardware residing in a secure, temperature-controlled server room. It was a virtual machine. A virtual machine is a way of running multiple instances of an operating system on a single piece of hardware. So there can be multiple servers that are all running on one single physical computer in the server room. It’s more cost-effective in a lot of cases, and with the hyper-computing-level power available these days, it’s generally a great way to do things.

The thing about virtual machines is that to the host operating system, they look like files. And files are easy to delete.

Enter Chuckles. Not his real name. (ed: did that need to be said?)

Chuckles was one of 4 system admins that handled the day-to-day operations of our production servers. “Production” here means the servers that got used by our internal employees on a regular basis. Some hosted development environments that our software engineers used to test updates and other work, some hosted files accessed by everyone from the mail room team to the CEO, and some hosted backups. Some were hosted locally in our on-site server rooms and some were at a server farm out in the ‘burbs (Ashburn, Va. maybe?). There were other teams who handled the development sandboxes (areas where the developers could test new technology without the possibility of bringing the whole network crashing down around our ears).

Chuckles was doing routine maintenance on one of the backup servers when he noticed what he thought was a redundant copy of the main file server. Did Chuckles check to verify which physical server he was working on? No. Did he right-click and check for active connections? No. Did he verify when the files had last been modified? No. No he did not.
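For the record, two of those three checks are the kind of thing you can script. Here’s a rough Python sketch of what I mean, not anything we actually ran. The function name, the host name you pass in, and the 30-day threshold are all made up for illustration, and it assumes the virtual machine lives in a plain file that gets written to while the machine is in use. The active connections check depends entirely on which virtualization platform you’re on, so I left it out.

```python
import os
import socket
import time


def deletion_blockers(path, expected_host, min_idle_days=30, now=None):
    """Return a list of reasons NOT to delete `path`.

    An empty list means these two checks raised no objection.
    All names and the 30-day default are hypothetical.
    """
    now = time.time() if now is None else now
    blockers = []

    # Check 1: are we on the machine we think we are on?
    actual_host = socket.gethostname()
    if actual_host != expected_host:
        blockers.append(f"on host {actual_host!r}, expected {expected_host!r}")

    # Check 2: has the file been modified recently?
    idle_days = (now - os.path.getmtime(path)) / 86400
    if idle_days < min_idle_days:
        blockers.append(
            f"modified {idle_days:.1f} days ago, "
            f"under the {min_idle_days}-day threshold"
        )

    return blockers
```

If that list comes back with anything in it, you stop and go ask somebody. A file that was modified this morning is probably not a stale backup.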

Did Chuckles ignore 3 warnings that this was NOT, in fact, a backup and was, in fact, a current production server that he was deleting?

Yes. Yes he did.

And with that, our primary file server was gone. Every employee who had a file open that was stored on that server was left with an error and a morning of lost work. And the icing on this shit cake? Chuckles took a smoke break immediately after hitting Delete so nobody knew what had happened or could find him to ask.

The saving grace for my team and by extension my day? Other than a handful of files I worked with, our tools were stored on a completely different sub-network. Add to that the fact that we were the hardware team, and we were relatively unscathed.

Not so for Chuckles. To say he got a write-up is an understatement. In fact, nobody saw him for the bulk of that day as he was shuffled off to multiple C-people’s offices to explain himself while his team worked to fix his mess.

We trusted you NOT to delete a production server, Chuckles. How could you betray us like that?
