Techiemedia down 9+ hours!
-
The problem started at 10am EST, when we noticed that the links to our second datacenter were down. We saw link but no activity on the 2 sets of fiber that connect our 5th floor datacenter with our 8th floor datacenter.
We run a layer-3 switch setup with a load-balanced hot spare, connected to our core network between floors, to light our second facility. Our immediate thought was that the primary switch had failed in some way and the secondary did not take over like it should. This led to a very long troubleshooting process with our tech, even involving the switch vendor, who had us update IOS and run just about every test they could think of, with no result. Since the switches were identical in every way, we had to be sure. We then decided to put a cold spare switch of a different brand online to take their place, and we had the same result: link and no activity.
The next step was to have the fiber company that handles our dark fiber test everything, and also to have the building meet-me room check everything. They both told us the problem must be with our equipment.
We went back to more troubleshooting and actually brought the primary switch from the second datacenter all the way up into the main datacenter and connected it to the core on short fiber, and it worked fine. It just did not work when connected between floors.
We then went back and demanded that the fiber company take a second look. This time they DID find an issue and corrected it, and our second datacenter was back online.
This is one of those times when we did everything we possibly could and worked tirelessly with all the vendors involved, and the problem was still caused and then prolonged by something we could not control, as the hours of the day just seemed to fly by. So, to sum it up, we got hit in a soft spot and it hurt.
We have been in constant contact with our customers, and many have been told what we just posted by phone and ICQ.
The situation has been 100% resolved.
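For anyone who wants to catch the "link but no activity" symptom described above automatically, here is a minimal sketch. It assumes you can already pull two snapshots of a port's link state and packet counters from your own monitoring. `PortSample` and `link_but_no_activity` are hypothetical names for illustration, not part of any switch vendor's API. Treating "no activity" as "the receive counter did not move" is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """One snapshot of a port (hypothetical record, filled from your own monitoring)."""
    link_up: bool
    rx_packets: int
    tx_packets: int


def link_but_no_activity(before: PortSample, after: PortSample) -> bool:
    """Return True when the port held link across both samples but received nothing.

    Assumption: "no activity" means the receive counter did not advance.
    The transmit counter is ignored, since a switch may keep sending
    into a path that delivers nothing back.
    """
    if not (before.link_up and after.link_up):
        # Link is down in at least one sample: a different, more obvious failure.
        return False
    return after.rx_packets == before.rx_packets


# Example: link stays up, we keep transmitting, nothing comes back.
suspect = link_but_no_activity(
    PortSample(link_up=True, rx_packets=100, tx_packets=50),
    PortSample(link_up=True, rx_packets=100, tx_packets=80),
)
```

A check like this only flags the symptom. It cannot tell a bad switch from a bad fiber path, which is why swapping hardware and testing on short fiber was still needed.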
-
-
50 router fixes
The Only Time When Success Comes Before Work Is In A Dictionary.
Did you ever notice: When you put the 2 words 'The' and 'IRS' together it spells 'Theirs.'
-
right on - seems like my stuff was in the other datacenter then
-
Sounds like maybe you need a second alternate path to your second datacenter in the event of failure.
-
wow, 16 hours is some serious downtime! Scary to see that! Hopefully everyone that was affected is back online and this doesn't happen again.
As Rowan stated, always have a proper backup and recovery plan in place. Never know what can happen!
**RIP TD**
-
dey too small for webhosting talks
How come it takes 16 hours of downtime for da host to tell you why you was down? Seriously folks dont you think dey shoulda posted updates throughouts?
I think In dis day an age if your host is down dat long there some serious issues wit em.


I'm not Ali A, not Ali B, Ali C, Ali D, Ali E, Ali F... but... Ali G!
Booyakasha!!!!


-
Yup, backups are something anyone should do. But sometimes ppl are too lazy to even think about it... I do my own backups even though all my sites are backed up by the hosting company.

Just sidetracking for a moment, and this has nothing specifically to do with Techiemedia...
How many of you have complete backups - or at least enough to get you going again - to recover from a catastrophic event such as fire physically destroying your server, or an extended outage?
Or something similarly devastating such as a host quietly going out of biz, and by the time you hear of it your server has been trucked off to who knows where to have its HD erased and given to a new customer?
(And if you have no backups - then something as simple as a HD failure could completely wipe you out...)
Just something to think about.
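On the backup question, one cheap safeguard is to check regularly that your newest off-server backup is actually recent. Below is a rough sketch, assuming your backups land as plain files in a single local directory. The function names and the 24-hour threshold are hypothetical choices for illustration, not anything from this thread.

```python
import os
import time


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the most recently modified file, or None if there are no files."""
    now = time.time() if now is None else now
    with os.scandir(backup_dir) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_stale(backup_dir, max_age_hours=24.0, now=None):
    """Return True when there is no backup at all or the newest one is older than max_age_hours."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours
```

Run something like `backup_is_stale("/path/to/your/backups")` from a scheduled job and alert when it returns True. This only proves a recent file exists, not that it can be restored, so an occasional test restore is still worth doing.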