View Full Version : The outages... and expect a short one Tomorrow or Friday...


spark@onestopknifeshop.com
March 2, 2000, 01:46 AM
Talk about a frigging comedy of errors...

The last few days have definitely been a stress-fest, to say the least.

Here's the lowdown on what happened, and what's going on...

TFL and BladeForums.com currently reside on a server co-located with a Jacksonville, FL (my area) ISP. This ISP was bought out a few months ago, and they've been building a new Class 5 facility locally for their operations (Class 5 means it can take some seriously bad ju-ju and still operate).

This means that when the building was completed, all the servers were scheduled to be moved over to the new facility. Unfortunately, prior to this move, the ISP failed to provide us with all of the information that we needed in order to make a decision on whether to stay or go. We said that pending more information, we'd stay, but we've been hounding them for the information. Naturally, they managed to overlook us completely, despite our calling twice a week for the last 3 months.

So, Monday, bright and early, we come into work and find that we have no access to the server. Why? Because our ISP chose that morning to move everything to the new facility. Joy.

And they didn't contact us. Great.

And when we called them to find out why, they gave us attitude that it was our fault for not filling out the paperwork that we should have gotten, and for not contacting them (?!?!? Mistake #1: never assume that the bases will be covered, even when you do everything you can to make sure they are. AKA, never underestimate the incompetence of others).

After some nice screaming and threats of bodily harm, the person in question was informed that he was in error by his subordinates, and plans were made to get us back on the network.

Cut to scene two - my boss and I go to visit our server, sitting all by itself in this now mostly vacant building. Boy, is it dark and quiet without all the other servers! Well, since the server is off the network anyway, this sure seems like a good time to upgrade the server software and kernel! (Mistake #2 - no upgrade ever goes smoothly, especially on mission critical equipment).

Naturally, the upgrade crashes and burns, leaving us with a server that doesn't serve webpages. Around this time, we're hooked back onto the network though... thank heavens for that, one less problem to track down...

This sounds like a call for Tech Support! Unfortunately, unlike Microsloth, RedHat's tech support is $225 per incident, and it took one of those just to get the server to the point where it would connect to the network. It turns out that the person who set up our server did so in a "non-standard configuration" (Mistake #3, never think that the previous guy did things the right way), which wound up screwing us when we upgraded. It took the best part of 3 hours to even get it to the point where it would look out at the network (though it wouldn't serve pages - hence the "connection refused" errors you guys were seeing).

At that point, the tech said, "Sorry, follow the installation instructions, and this is where I say 'Time', you are at the end of your call." (?!?!?).

Fine, time to call it a night, regroup, and come back refreshed and ready for battle. Since we're now at the point where I can remote into the server without a problem, we're a lot better off anyhow because I can work from the office.

I come in yesterday bright and early, and call RH tech support again for another troubleshooting call. This was at 0900. It doesn't end until 1630, at which time BladeForums.com and TFL are back online. Then we have the other problem with the forums not quite working correctly after the server upgrade, which lasts until around midnight, when I finally get that tracked down. I finally got the server and all of the hosted sites up 100% today at 1700.

But, in the end, the server software is upgraded, things are all patched up, and everything seems to be working just fine right now, though there are still some background issues I'm looking at.

Just a heads up, we'll be physically moving the server some time tomorrow (it's still by itself in the old building, remember?), so expect an outage for about an hour or so while we move it.

Rich, you can turn the search back on, and when you are trying to update threads, do it at 30 per, that should stop the crashing problem you were having.

Thanks for your patience, guys, I appreciate it. I'm not a Linux guru, but I busted my ass to get us back, and I'm hoping that we won't have to go through that any time soon.

Spark

------------------
Kevin Jon Schlossberg
SysOp and Administrator for BladeForums.com
www.bladeforums.com (http://www.bladeforums.com)

fastforty
March 2, 2000, 03:19 AM
Wow, and I thought I was having all the fun when our motherboard earned its name, and wifey couldn't get to her garden chat for three days. :D
Seriously, thanx for all of the hassle, I was starting to have nightmares again.

George Hill
March 2, 2000, 06:50 AM
Spark - You should get a TFL Medal of Honor.
Thank you.

TheBluesMan
March 2, 2000, 08:37 AM
Thanks for going above and beyond Spark!
:D :D :D :D :D

Rich Lucibella
March 2, 2000, 09:57 AM
Spark-
Thanks for the hard work. The Update finished overnight and the Search function has been restored.
Rich

Gunslinger
March 2, 2000, 11:23 AM
Thank you Spark. Your efforts were appreciated. :)


------------------
Gunslinger

We live in a time in which attitudes and deeds once respected as courageous and honorable are now scorned as being antiquated and subversive.

Mal H
March 2, 2000, 12:01 PM
Spark, great work. Perhaps more than most here, I know exactly what you were going through. In my younger days I was a mainframe computer geek instead of a mini/micro computer geek like now. If you want pressure, imagine having a computer go casters up for more than a day at the weather bureau, NSA, CIA, FED, or the worst one - at the White House. Believe me, none of those guys have any sense of humor at all like we do here. :) I have filled the skies with jets full of wingtips and white shirts many times. **it happens.

K80Geoff
March 2, 2000, 12:39 PM
Spark. Thanks for your superhuman efforts from all of us common members at TFL.


Geoff Ross

ernest2
March 2, 2000, 01:06 PM
Sparks, thanks for fixing tfl for us all.
When it was down, I felt as if my dog had just died.
Funny how you get addicted to something like tfl.
I am sure many share my feelings on this and
know what I mean, so to speak.

------------------
-They call 'em POLUTE-TICIANS because they POLUTE the MINDS
of OUR CHILDREN with their ANTI firearm RIGHTS SOCIALIST
political agendas. We of the older generations know B.S.
when we hear it.
-----------------------------------------------
In 2000, we must become politically active in
support of gun rights or we WILL LOSE the right
& the freedom.
-------------------------
NO FATE BUT WHAT WE MAKE!!!
----------------------
Every year, over 2 million Americans use firearms
not to take life but to preserve life, limb & family.
Gun Control Democrats would prefer that they are all disarmed
and helpless and die victims of felony violence, instead.

Protect your gun rights, go to:
http://home.xnet.com/~gizmonic/TheMarch.html
and sign up as a helper or attendee or state organizer.
ernest2, Conn. CAN opp. "Do What You Can"!
http://thematrix.acmecity.com/digital/237/cansite/can.html

12-34hom
March 2, 2000, 02:18 PM
Thank you, kind sir, for your efforts. This place is an addiction, or was that affliction? Never mind... thanks again for your efforts!

aztec777
March 2, 2000, 03:21 PM
Spark-Thank you for your hard work. It is very appreciated. I was having DT's and couldn't eat. ;)

Steve

Bud Helms
March 2, 2000, 05:08 PM
Well done, Spark. Thank you.

Jason Demond
March 2, 2000, 05:14 PM
Great work guys!

HankL
March 2, 2000, 08:02 PM
Kevin and Rich, "Thank you, thank you, thank you very much!" Forwarded to me by a friend.
Hank