Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums.

Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed.

 
Old 11-13-2005, 09:40 AM   #1
fusionx
Confirmed User
 
Industry Role:
Join Date: Nov 2003
Location: Olongapo City, Philippines
Posts: 4,618
Hosted at Sagonet? Whole &%$! datacenter is down

grrrr.. see topic

Gotta say - first time anything like this has happened to me in about three years. Sagonet has always been pretty reliable.
Old 11-13-2005, 10:25 AM   #2
crockett
in a van by the river
 
crockett's Avatar
 
Industry Role:
Join Date: May 2003
Posts: 76,806
So did they forget to pay the power bill or something? Someone cut a fiber? Why is it down?
__________________
In November, you can vote for America's next president or its first dictator.
Old 11-13-2005, 10:28 AM   #3
fusionx
Confirmed User
 
Industry Role:
Join Date: Nov 2003
Location: Olongapo City, Philippines
Posts: 4,618
Quote:
Originally Posted by crockett
So did they forget to pay the power bill or something? Someone cut a fiber? Why is it down?
No idea.. I talked to my reseller who talked to them - they didn't have time to go into detail.

Is Flashcash.com hosted there? It's down as well.
Old 11-13-2005, 10:31 AM   #4
chowda
Confirmed User
 
Join Date: Jun 2003
Location: Gooch city
Posts: 9,527
fuck, i wish i backed up...
__________________
Someone finds you...
2007

PS: Nationalnet is the best host I've ever had. And i tried alot of them.
Old 11-13-2005, 10:31 AM   #5
boner 2.0
Too lazy to set a custom title
 
Join Date: Jul 2004
Posts: 10,970
Quote:
Originally Posted by fusionx
Is Flashcash.com hosted there? It's down as well.
www.flashcash.com [207.150.181.34]

arin:

Sago Networks SAGO-20040121-1400 (NET-207-150-160-0-1)
207.150.160.0 - 207.150.191.255
Rhino, LLC SAGO-207-150-181-0 (NET-207-150-181-0-1)
207.150.181.0 - 207.150.181.255
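For anyone who wants to double-check a lookup like this, here is a minimal sketch using Python's standard `ipaddress` module. It only confirms that the flashcash.com address quoted above falls inside the Sago Networks range from the ARIN output; the constant and function names are made up for illustration.

```python
import ipaddress

# Values copied from the ARIN output quoted in this post.
SAGO_FIRST = ipaddress.IPv4Address("207.150.160.0")
SAGO_LAST = ipaddress.IPv4Address("207.150.191.255")
FLASHCASH_IP = ipaddress.IPv4Address("207.150.181.34")


def covering_networks(first, last):
    """Return the CIDR networks that exactly cover the range first..last."""
    return list(ipaddress.summarize_address_range(first, last))


def is_inside(ip, first, last):
    """True if ip belongs to one of the networks covering first..last."""
    return any(ip in net for net in covering_networks(first, last))


if __name__ == "__main__":
    print(covering_networks(SAGO_FIRST, SAGO_LAST))
    print(is_inside(FLASHCASH_IP, SAGO_FIRST, SAGO_LAST))
```

The whole Sago allocation collapses to a single /19, and the /24 listed for Rhino, LLC sits inside it.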
Old 11-13-2005, 10:34 AM   #6
gdog
Confirmed User
 
Join Date: Aug 2004
Location: San Diego
Posts: 1,550
Damn that sucks, I had it happen one time about 5 years ago and it is a pain in the ass with the customer service e-mails and the webmaster e-mails, and that is if you keep your e-mail separately on an in-house server and not theirs. Otherwise everything gets bounced.

G
__________________
Busty Amateurs is Back - Meet the Girls | Read the BLOG | See the UPDATES

High conversions and high retention, almost 1000 hosted galleries 995 PAYS
Samples Big Boobs | Busty Babes | Strap Ons | Hardcore | Amateurs | Videos
Old 11-13-2005, 10:35 AM   #7
ejeet
Confirmed User
 
Join Date: Nov 2005
Posts: 258
Just wait, Veterens Day and his fake nicks (simple simon) will be here to whore ISPrime.
Old 11-13-2005, 10:44 AM   #8
phonesex
Confirmed User
 
Join Date: Mar 2005
Location: Phone Sex Pays! Believe it!
Posts: 3,437
unable to view flashcash.com from here too.
Old 11-13-2005, 11:16 AM   #9
johndoebob
Confirmed User
 
Join Date: Mar 2004
Posts: 3,405
Who hosts there anyway? Their lines are the worst I've seen yet, timeout galore.

If you host at Sagonet you're just asking for problems. The money you save can't be worth the constant downtime you're getting there.
Old 11-13-2005, 11:17 AM   #10
Dalai lama
Strength and Honor
 
Join Date: Jul 2004
Location: Europe
Posts: 16,540
yep it's all down. this sucks
__________________

A program you can trust.
Gallerybooster Run multiply TGPs of 1 script
Old 11-13-2005, 11:37 AM   #11
fusionx
Confirmed User
 
Industry Role:
Join Date: Nov 2003
Location: Olongapo City, Philippines
Posts: 4,618
Quote:
Originally Posted by johndoebob
Who hosts there anyway? Their lines are the worst I've seen yet, timeout galore.

If you host at Sagonet you're just asking for problems. The money you save can't be worth the constant downtime you're getting there.
This is the first time Sagonet has been down for me in three years. No complaints. Shit happens.

But I'm still upset

Mostly because I had work planned for today. I guess I'll do laundry!
Old 11-13-2005, 11:45 AM   #12
Lycanthrope
Confirmed User
 
Lycanthrope's Avatar
 
Industry Role:
Join Date: Jan 2004
Location: Wisconsin
Posts: 4,517
Everybody's host "is the best!" until things like this happen. While it is unfortunate, it can and does happen to everyone. If it happens a lot, well obviously it is time to go, but if this is an isolated instance, don't judge them by the downtime, but by how they react to it and make up for it.
Old 11-13-2005, 11:52 AM   #13
tdp-Cool-content
Confirmed User
 
Join Date: Oct 2002
Location: Out there
Posts: 321
got this in the mail:

Dear Valued Customer,

The following maintenance has been scheduled.

Sago Networks router code maintenance notification - Tampa, Fl facility.

Dates: November 13th 2005
Times: 12:00AM to 6:00AM
Maintenance: Core router code upgrades and BGP peer maintenance.
Impact: Major. Customers will experience several instances of network unavailability in durations of 5 to 15 minutes per occurrence.
Sago Networks maintenance ID: COP-63705-228

If you have any questions, feel free to call Sago Networks Network Operations Center at 866-510-4000 and refer to the above maintenance ID.

Regards,

IP Engineering
Sago Networks
__________________
Webmaster Content Cool-content

Webmaster Resources cool-XXXresources
Old 11-13-2005, 11:57 AM   #14
fusionx
Confirmed User
 
Industry Role:
Join Date: Nov 2003
Location: Olongapo City, Philippines
Posts: 4,618
Sunday morning.. I should have guessed it was maintenance gone bad. As I mentioned before, stuff happens.. I'm pretty happy with Sagonet overall. I'm sure they'll get it fixed soon.

Quote:
Originally Posted by tdp-Cool-content
got this in the mail:

Dear Valued Customer,

The following maintenance has been scheduled.

Sago Networks router code maintenance notification - Tampa, Fl facility.

Dates: November 13th 2005
Times: 12:00AM to 6:00AM
Maintenance: Core router code upgrades and BGP peer maintenance.
Impact: Major. Customers will experience several instances of network unavailability in durations of 5 to 15 minutes per occurrence.
Sago Networks maintenance ID: COP-63705-228

If you have any questions, feel free to call Sago Networks Network Operations Center at 866-510-4000 and refer to the above maintenance ID.

Regards,

IP Engineering
Sago Networks
Old 11-13-2005, 12:01 PM   #15
martinsc
Too lazy to set a custom title
 
Industry Role:
Join Date: Jun 2005
Location: 127.0.0.1
Posts: 27,047
http://www.addfreestats.com is down too....
damn i need my stats...
__________________
Make Money
Old 11-13-2005, 12:07 PM   #16
webair
Confirmed User
 
webair's Avatar
 
Industry Role:
Join Date: Feb 2002
Location: NYC, NY
Posts: 8,531
ya its down from NY as well did they send out a maintenance notification prior to the work being performed?
Old 11-13-2005, 12:11 PM   #17
BitAudioVideo
Confirmed User
 
Join Date: Jul 2005
Location: USA, Georgia
Posts: 1,246
they added Savvis to the blend (Savvis hosts the NYSE and other biggies)
apparently it's causing some routing issues

i havent had a reply on when they think it will settle down
__________________
Hi-Quality Encoding - Bulk Orders - On Time!
http://bitaudiovideo.com
icq 50476697 - aim n3r0xXx
Old 11-13-2005, 12:12 PM   #18
BitAudioVideo
Confirmed User
 
Join Date: Jul 2005
Location: USA, Georgia
Posts: 1,246
Quote:
Originally Posted by webair
ya its down from NY as well did they send out a maintenance notification prior to the work being performed?
they sent one, fairly last minute
__________________
Hi-Quality Encoding - Bulk Orders - On Time!
http://bitaudiovideo.com
icq 50476697 - aim n3r0xXx
Old 11-13-2005, 12:39 PM   #19
dodger21
Confirmed User
 
Join Date: Jan 2003
Location: Los Angeles
Posts: 2,680
I got my notification several days beforehand, not last minute.
__________________
icq: 237055440
Old 11-13-2005, 02:48 PM   #20
chowda
Confirmed User
 
Join Date: Jun 2003
Location: Gooch city
Posts: 9,527
still down. booooo
__________________
Someone finds you...
2007

PS: Nationalnet is the best host I've ever had. And i tried alot of them.
Old 11-13-2005, 02:50 PM   #21
BitAudioVideo
Confirmed User
 
Join Date: Jul 2005
Location: USA, Georgia
Posts: 1,246
i just got this...

The upgrade this morning was a mass-software upgrade in preparation for some large additions to our network and to maintain some general housekeeping.

The upgrade path was pretty clear cut and, as usual, verified by a third party. The initial upgrade was successful, but around 3am EST certain routing loops caused unusually high processor utilization. This issue resulted in the datacenter's decision to back out of the upgrade. The process of downgrading the software on the core switches was impeded by a version mismatch between previously existing firmware on external routing cards and the management software on the switch itself. This left the datacenter in limbo between two versions, unable to reimplement the original code. They are currently waiting on a software patch to be written by the manufacturer to enable them to either go fully forward or fall back on the old version.

Right now, the most obvious symptom of this condition is packet loss. The datacenter is working with the manufacturer, who is onsite, to resolve this.

We apologize for the on-going issues and will update you as soon as we have more information.
__________________
Hi-Quality Encoding - Bulk Orders - On Time!
http://bitaudiovideo.com
icq 50476697 - aim n3r0xXx
Old 11-13-2005, 04:35 PM   #22
r-c-e
Confirmed User
 
Join Date: Jul 2002
Posts: 1,070
I got the maintenance email almost a week ago.
Old 11-13-2005, 04:39 PM   #23
ejeet
Confirmed User
 
Join Date: Nov 2005
Posts: 258
So a planned maintenance fucked up on a biblical scale.
Old 11-13-2005, 05:38 PM   #24
BitAudioVideo
Confirmed User
 
Join Date: Jul 2005
Location: USA, Georgia
Posts: 1,246
Here is an update for you all:

Latest updates are that Foundry believes it has identified the cause of the software issues, and will be correcting them. Though this certainly has not gone as planned, we will post a full update as to the process, problems, and resolutions that we have undertaken with this upgrade. The matter will be resolved as quickly as possible. Technicians have been working on things round the clock, and will continue to do so. This is not something we are taking lightly, and we have deployed every available resource to fix things as quickly as possible.

Thank you,
__________________
Matthew McCormick
Director of Sales
allmanaged.com / sagonet.com
[email protected]
__________________
Hi-Quality Encoding - Bulk Orders - On Time!
http://bitaudiovideo.com
icq 50476697 - aim n3r0xXx
Old 11-13-2005, 05:48 PM   #25
davidd
Confirmed User
 
Industry Role:
Join Date: Jul 2003
Posts: 1,076
Yes, we (FlashCash) are hosted there, and we were notified of the scheduled maintenance.

I have been in the hosting industry since 1998, so I know what it feels like when scheduled maintenance turns into a fuck story.

Sago rocks, plain and simple. Unfortunately growth always has pain...
Old 11-13-2005, 06:22 PM   #26
AdultEUhost
ORLY?
 
AdultEUhost's Avatar
 
Industry Role:
Join Date: Oct 2005
Location: NL & US
Posts: 2,579
Although with redundant routers it should not be a problem I wish Sagonet good luck with restoring the firmware.
__________________
ICQ: 267-443-722 / leon [at] adulteuhost [dotcom]

Nominated for an XBIZ Award as "Webhost of the Year" in 2007, 2012, 2013 and 2014
Old 11-13-2005, 06:38 PM   #27
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by AdultEUhost
Although with redundant routers it should not be a problem I wish Sagonet good luck with restoring the firmware.
Redundant routers don't help if your routing setup is buggered.
Old 11-13-2005, 10:14 PM   #28
r-c-e
Confirmed User
 
Join Date: Jul 2002
Posts: 1,070
How's it all going?
Old 11-13-2005, 10:56 PM   #29
bbe
Confirmed User
 
Join Date: Feb 2005
Posts: 110
They're fucking scumbags. Host more botnets and scam pages than anyone else. Probably being ddos'ed.
Old 11-13-2005, 11:19 PM   #30
24inchspinners
Registered User
 
Join Date: Jan 2005
Posts: 14
Quote:
Originally Posted by bbe
They're fucking scumbags. Host more botnets and scam pages than anyone else. Probably being ddos'ed.
find one clean hosting comp that has never had an incident with botnets or scam pages

retard
Old 11-13-2005, 11:26 PM   #31
tdp-Cool-content
Confirmed User
 
Join Date: Oct 2002
Location: Out there
Posts: 321
They really need to fix this shit soon!!! got damn it
__________________
Webmaster Content Cool-content

Webmaster Resources cool-XXXresources
Old 11-14-2005, 01:05 AM   #32
NewbieNudes
Confirmed User
 
Join Date: Jan 2003
Posts: 939
Can We Have An Update Please?
__________________
| Click Here to join our unique high converting program

| Add yourself for free traffic!

ICQ: 279 738 569 | Skype: NewbieNudes | Email: affiliates at newbienudes dot com
Old 11-14-2005, 01:32 AM   #33
Orsm
Confirmed User
 
Join Date: Aug 2001
Location: Perth Australia
Posts: 252
Quote:
Originally Posted by janos
Can We Have An Update Please?
Bumpo.

Thanks.
__________________
<insert something witty here>
Old 11-14-2005, 01:43 AM   #34
Jace
FBOP Class Of 2013
 
Industry Role:
Join Date: Jan 2004
Location: bumfuck, ky
Posts: 35,562
isn't it cool how you can get the manufacturer of the product on site to fix shit when you have money?

that would be like being bill gates and having a dvd player fuck up on me, and calling sony....and sony sending one of their techs out to fix my shit..

haha
Old 11-14-2005, 02:47 AM   #35
maxxxxx
Confirmed User
 
Join Date: Jul 2003
Posts: 646
Quote:
Originally Posted by davidd
Yes, we (FlashCash) are hosted there, and we were notified of the scheduled maintenance.

I have been in the hosting industry since 1998, so I know what it feels like when scheduled maintenance turns into a fuck story.

Sago rocks, plain and simple. Unfortunately growth always has pain...
I guess as one of the big guys you get extra service and good servers from them. If you are not that big in this business, Sagonet is a company I would never ever host anything with again. I was with them for a while, but there were far too many problems that had nothing to do with the admin of my server. Too many downtimes; it seemed that every time they needed to do some maintenance the whole system f... up. They also billed my credit card twice for the same period several times. I got the refunds, but if I hadn't noticed...
I have now moved to a European host. Same money, or even slightly less, much (!) faster servers. Just by moving away from Sago my income has gone up by about 50%!
__________________


****Teen Harbour**** - Home of Little Caprice
-------------------------------------------------------------------
In a perfect world... spammers would get caught, go to jail, and share a cell with many men who have enlarged their penisses, taken Viagra and are looking for a new relationship.
Old 11-14-2005, 04:03 AM   #36
d00t
Confirmed User
 
Industry Role:
Join Date: Sep 2002
Location: In your mind
Posts: 3,766
Does anyone have a phone # for them that DOESN'T go to an answering machine?

Just hit 17.5 hours downtime
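To put that number in perspective, here is a rough sketch of what 17.5 hours of downtime does to an availability figure. It assumes the 17.5 hours is the only downtime in the period and that a month is 30 days; the function name is made up for illustration.

```python
HOURS_PER_DAY = 24


def availability_percent(downtime_hours, period_days):
    """Percentage of the period that the service was reachable."""
    period_hours = period_days * HOURS_PER_DAY
    return 100.0 * (1.0 - downtime_hours / period_hours)


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        pct = availability_percent(17.5, days)
        print(f"{label}: {pct:.2f}% available")
```

Under those assumptions a single outage of this length works out to roughly 97.57% for the month and roughly 99.80% for the year.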
Old 11-14-2005, 04:11 AM   #37
ffmihai
keep walking...
 
ffmihai's Avatar
 
Industry Role:
Join Date: Jun 2002
Posts: 7,177
17 hours is a looong time
Old 11-14-2005, 04:31 AM   #38
NewbieNudes
Confirmed User
 
Join Date: Jan 2003
Posts: 939
Sago Can We Be Notified Of What The Fuck Is Going On Please?
__________________
| Click Here to join our unique high converting program

| Add yourself for free traffic!

ICQ: 279 738 569 | Skype: NewbieNudes | Email: affiliates at newbienudes dot com
Old 11-14-2005, 06:26 AM   #39
Screaming
I can change this!!!!!
 
Join Date: Feb 2004
Posts: 18,972
Time to look at nat-net.
Old 11-14-2005, 08:14 AM   #40
SpeakEasy
Confirmed User
 
Join Date: Sep 2002
Location: The Internet
Posts: 2,681
Quote:
Originally Posted by ejeet
Just wait, Veterens Day and his fake nicks (simple simon) will be here to whore ISPrime.
I think they asked him to stop posting about them and to remove his sig because of how retarded he makes them sound.
Old 11-14-2005, 08:26 AM   #41
Boss Traffic Jim
Confirmed User
 
Join Date: Nov 2002
Location: USA
Posts: 1,150
Quote:
Originally Posted by johndoebob
Who hosts there anyway? Their lines are the worst I've seen yet, timeout galore.

If you host at Sagonet you're just asking for problems. The money you save can't be worth the constant downtime you're getting there.
I was going to say something like that as well.
Old 11-14-2005, 11:03 AM   #42
chowda
Confirmed User
 
Join Date: Jun 2003
Location: Gooch city
Posts: 9,527
my box is up again
__________________
Someone finds you...
2007

PS: Nationalnet is the best host I've ever had. And i tried alot of them.
Old 11-14-2005, 11:09 AM   #43
tdp-Cool-content
Confirmed User
 
Join Date: Oct 2002
Location: Out there
Posts: 321
yes there is a connection now, but is it not slower than before? I have a feeling that it takes ages to load my site now. can anyone else see this on their site?
__________________
Webmaster Content Cool-content

Webmaster Resources cool-XXXresources
Old 11-14-2005, 11:17 AM   #44
theFeTiShLaDy
Confirmed User
 
Join Date: Jun 2004
Posts: 2,615
that sucks if they were down for more than 5hours.
__________________
I'm a freelance babe!
Old 11-14-2005, 12:17 PM   #45
Gemhdar
Confirmed User
 
Gemhdar's Avatar
 
Industry Role:
Join Date: Aug 2004
Posts: 204
Here is a brief update on what happened during the scheduled upgrade and what is going on at this point. If anyone has any further questions, please feel free to email me or contact me directly via Instant Messenger...

With that being said...

The network upgrade began prior to Saturday night / Sunday morning, with verification that Foundry had successfully implemented the new upgrades on our specific routers and environment.

Beginning Saturday night, the dual core routers were removed individually from HSRP (Hot Standby Router Protocol) and one was taken down. Upgrades were completed and tested, the router was brought back online, and traffic was switched to pass over it; this completed successfully and traffic was flowing normally. Then the 2nd core router was also brought down, upgraded, tested, and brought back online, and finally HSRP was enabled again. Traffic was flowing properly over both routers, and other normal-maintenance network upgrades were started (replacing some cables and such in various portions of the network).
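The one-router-at-a-time sequence described above can be sketched roughly as follows. This is an illustration only, not Sago's or Foundry's actual tooling: `rolling_upgrade`, `upgrade`, and `healthy` are hypothetical names, and the callbacks stand in for whatever really takes a router out of service, loads code, and tests it.

```python
from typing import Callable, List


def rolling_upgrade(
    routers: List[str],
    upgrade: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade routers one at a time, stopping at the first failed check.

    Returns the names of the routers that were upgraded and passed
    their health check, in order.
    """
    completed = []
    for name in routers:
        upgrade(name)          # hypothetical: pull from service, load new code
        if not healthy(name):  # hypothetical: latency / packet-loss test
            break              # leave the remaining routers untouched
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Toy callbacks: every upgrade "succeeds" and every check passes.
    print(rolling_upgrade(["core-1", "core-2"], lambda n: None, lambda n: True))
```

Note that a gate like this only tests each router at one moment in time. A problem that shows up gradually after both routers are back in service, such as the memory leak described in the next paragraph, would pass it.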

Traffic began showing packet loss due to memory leaks that caused CPU overloads in both routers, and engineers began to investigate the problem and identify a solution. It was confirmed that the software upgrade itself had issues, and attempted rollbacks to the previous software were unsuccessful. Though not mentioned in the upgrade reports, there was an underlying firmware upgrade that took place, which prevented us from rolling back the software to the previous versions; moving forward was therefore the only alternative. Patches were applied to allow traffic to begin flowing through the routers once again while we worked on the problem at hand. Significant packet loss was still prevalent, but traffic was flowing. We then worked with Foundry's engineers to have new code installed onto the routers, which was done last evening at approximately 9PM. From the base OS load, patches and tweaks continued throughout the night and still continue as we receive reports of existing packet loss. Though most customers have reported improving performance, we are still aware of issues and are resolving them one by one to restore previous conditions, so that we may continue with planned upgrades to our network.

Though frustrating, we are all doing the very best we can to restore service as quickly as possible. Additional plans are underway for continued upgrades and service offerings, and we will announce later on this week long awaited information as to network and backbone provider additions. Later on today, we expect the majority of all issues to be resolved, though a specific ETA is not available, as quickly as possible is when things will be fixed. We appreciate your continued patience and understanding at this time, and full technical reports will be given once our network engineers are available to discuss the matter, as currently their only priority is to resolve any remaining problems and have the network back to its previous state.
Old 11-14-2005, 12:59 PM   #46
xdcdave
Confirmed User
 
Join Date: Feb 2003
Location: North East
Posts: 1,911
Why did the techs from Foundry, that were supposedly on site, not bring replacement hardware with them? Why did Sago not have replacement hardware on site?

The amount of downtime we experienced yesterday was completely unacceptable, and frankly, I hope Sago sees a major loss of clients from this preventable outage.
Old 11-14-2005, 08:39 PM   #47
Gemhdar
Confirmed User
 
Gemhdar's Avatar
 
Industry Role:
Join Date: Aug 2004
Posts: 204
We wanted to take this opportunity to update you and fully explain the events that occurred on our network this past weekend.

Part of operating a large-scale, growing, living network is constant upgrades. Upgrades on the distribution and access layers of our network are quite common and generally uneventful. Upgrades to our core infrastructure are a major task and occur on a major level every 12-18 months. Our core is a fully meshed 60Gbps (6x10Gbps) backbone that directs our peering traffic out to two local carrier hotels. Before an upgrade is made to our core, the following steps are taken:


1.) 1 month out - General design implementation and conference between our IP Engineering department and manufacturer hardware and software experts. During this meeting, so-called "roll back" procedures are developed and median versions are established in the event of an emergency during the upgrade. Additionally, all major changes to our network are officially frozen until after the upgrade.

2.) 20 days out - Test implementation using a standby device. For this task, we implement the suggested hardware and/or software changes on a standby switch router and mirror our traffic on this device. Inbound/outbound route and traffic continuity is tested at this point.

3.) 10 days out - Second conference with manufacturer to finalize any potential version mismatches and finalize manufacturer's commitment to our upgrade as they recommend.

4.) 1 day out - Second "mirror" test to ensure no conflicts exist in new switch router software.

For this upgrade, all of the above tasks were completed. During step #2, Foundry supplied us with a tweaked version of the operating system for both the switch chassis and the individual 10Gbps cards that handle our long haul transport. On this same CD, they supplied us with a revised hardcopy of the current OS on the switches. This was to act as our emergency "roll back" procedure. The version of software currently running on these NetIron40G's was implemented by Foundry and required a live patch to operate properly.
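As a small worked example, the four lead times in the checklist above can be turned into calendar dates. This sketch assumes the upgrade day is November 13th 2005 (the date on the maintenance notice posted earlier in the thread) and treats "1 month out" as 30 days; neither the exact dates nor the names below come from Sago.

```python
from datetime import date, timedelta

# Assumption: upgrade day taken from the maintenance notice in this thread.
UPGRADE_DAY = date(2005, 11, 13)

# (days before upgrade, checkpoint). "1 month" is assumed to mean 30 days.
CHECKPOINTS = [
    (30, "design conference, roll back procedures, change freeze"),
    (20, "test implementation on standby device"),
    (10, "second conference with manufacturer"),
    (1, "second mirror test"),
]


def checkpoint_dates(upgrade_day, checkpoints):
    """Return (date, label) pairs for each checkpoint before upgrade_day."""
    return [(upgrade_day - timedelta(days=d), label) for d, label in checkpoints]


if __name__ == "__main__":
    for when, label in checkpoint_dates(UPGRADE_DAY, CHECKPOINTS):
        print(when.isoformat(), label)
```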

The events of the upgrade are as follows:

1.) Upgrade began. All traffic was failed to our secondary core infrastructure while the primary devices were removed from production and upgraded.

2.) Upgrade to primary core infrastructure was completed

3.) Traffic was failed 100% to the primary core infrastructure. QoS samples were taken via IronView and SolarWinds. Minimum latency, packet loss, and SNMP thresholds were deemed acceptable.

4.) Secondary core was upgraded.

5.) Traffic was moved 100% to secondary core infrastructure. QoS samples were taken via IronView and SolarWinds. Minimum latency, packet loss, and SNMP thresholds were deemed acceptable.

6.) The above simulated a failure on the network (e.g., 100% to primary or secondary infrastructure). The next test performed was to release the network as normally operated; therefore, traffic was balanced based on normal operating preferences to the two core devices, essentially 50/50. QoS samples were taken via IronView and SolarWinds. Minimum latency, packet loss, and SNMP thresholds were deemed acceptable. At this point, the core upgrade was considered completed. A few more non-service-impacting but high-risk housekeeping items were completed, specifically the installation of and migration to some new fiber transport within our facility to accommodate the opening of our new DC3 in Tampa.

7.) QoS measurements were once again taken, at this time our technicians noticed very moderate packet loss and what was deemed at the time to be a slow memory leak.

8.) Foundry was also monitoring the boxes and had begun working the issue. They initially attributed this to an "affiliation issue" via the OSPF downstream and HSRP lateral relationship between the core clusters. Because of this, all further maintenance was suspended within the window.

9.) The perceived memory leak worsened dramatically over the next hour. Foundry requested that we connect a standby non production switch via a mirror while failing one core out of production, performing a restart, and seeing if this temporarily stopped the leak.

10.) The reboot was performed, but the situation almost instantly became just as severe. Processor load on the outbound 10Gbps interface line cards was pegged at 99% on the rebooted core, and 90% on the unrebooted core.

11.) Processor load intensified to 99% on all outbound 10Gbps interfaces, >90% on the internal interfaces, and 99% on the switch management modules.

12.) Foundry was given 15 minutes to propose a course of action, but since no satisfactory course of action was forthcoming, the decision was made internally to revert to the old code distributed on the pre-upgrade CD.

13.) After steps 1-10 above were repeated, in reverse, traffic stabilized for less than 10 minutes. Amazingly, the high CPU load continued on previously operational code.

14.) At approximately 11AM EST Sunday, at the urging of the manufacturer, the routers were reconfigured and reinstalled using old configs and old code. However, the problem continued to manifest itself after a few minutes of operation.

15.) Sago management requested that the "old patched" software be loaded on the routers.

16.) Steps 1-10 above were commenced to load the "old patched" OS on the routers. The first to be upgraded failed and we found the ASICs refused the old code completely. Foundry investigated this issue and determined that the software upgrade implemented earlier apparently included a firmware upgrade that WAS NOT mentioned during the pre-upgrade meetings by Foundry.

17.) Foundry committed to produce code that would recreate the older firmware on the 10Gbps cards, thereby allowing a reload of the "old patched" software instead of the "old" software provided by Foundry on CD.

18.) While waiting for the production of this software, a Sago engineer noticed that no memory was allocated to IPv4 routes. Further, no CAM page files were devoted to storing routes or buffering for IPv4.

19.) Foundry investigated this finding and determined that only IPv6 allocations were being made. It was acknowledged that this was a typographical error made during the custom production of our software and existed in both the "old" and "new" software versions provided on CD.

20.) A conference call was held, and Foundry determined that given currently available resources, fixing the memory allocation issues was more feasible in a short timeframe. We elected to take their advice. The first software revisions became available at approximately 9PM EST Sunday and were implemented within the hour by following steps 1-10 above.

21.) 3 more updates/fixes became available throughout the night. The last was implemented around 6AM EST Monday.

22.) These each gradually reduced the persistent packet loss issues.

23.) The final packet loss issue involved an improperly computed equation causing an irreversible imbalance on only two single-Gbps outbound connections. This caused contained packet loss for outbound routes preferring those connections. Essentially, the switches were computing that the connection pipes were larger than 1Gbps and trying to force more traffic out of them than was feasible.

24.) This fix was implemented between 8PM and 9PM this evening.
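
The imbalance in step 23 can be illustrated with a minimal sketch. It assumes the switches split outbound traffic in proportion to the capacity they computed for each link; the actual Foundry algorithm is not described in this post. The link names, capacities, and function names below are hypothetical.

```python
# Real capacity of each outbound link in Mbps (hypothetical topology).
REAL_CAPACITY_MBPS = {"ge1": 1000, "ge2": 1000, "te1": 10000}

# Capacity as miscomputed by the switch: the two 1Gbps links are
# believed to be larger than they are (assumed values).
BELIEVED_CAPACITY_MBPS = {"ge1": 2500, "ge2": 2500, "te1": 10000}


def split_traffic(demand_mbps, believed_capacity_mbps):
    """Split demand across links in proportion to believed capacity."""
    total = sum(believed_capacity_mbps.values())
    return {
        link: demand_mbps * capacity / total
        for link, capacity in believed_capacity_mbps.items()
    }


def dropped_traffic(offered_mbps, real_capacity_mbps):
    """Traffic offered to each link beyond what it can really carry."""
    return {
        link: max(0.0, offered_mbps[link] - real_capacity_mbps[link])
        for link in offered_mbps
    }


if __name__ == "__main__":
    demand = 9000  # Mbps of outbound traffic, well under the 12000 real total
    bad = split_traffic(demand, BELIEVED_CAPACITY_MBPS)
    good = split_traffic(demand, REAL_CAPACITY_MBPS)
    print("drops with miscomputed sizes:", dropped_traffic(bad, REAL_CAPACITY_MBPS))
    print("drops with correct sizes:", dropped_traffic(good, REAL_CAPACITY_MBPS))
```

With the miscomputed sizes, each 1Gbps link is offered 1500 Mbps and drops 500 Mbps, while the 10Gbps link stays healthy. That matches the "contained" loss described above: only routes preferring the two small links suffer, even though total capacity is sufficient.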


At this time, all issues are resolved. Any customers still experiencing problems should immediately contact [email protected] as your issue is unrelated to anything above.

We sincerely apologize for the obvious problems this has caused you. Our network's performance during this incident is unacceptable and contrary to the normal way our company operates. A relentless investigation into why this occurred will continue to ensure that our customers are never subjected to such an incident in the future.

(post was edited and cut as it was too large continued below)
Old 11-14-2005, 08:40 PM   #48
Gemhdar
Confirmed User
 
Industry Role:
Join Date: Aug 2004
Posts: 204
(continued from above)

The following changes have already been implemented:

1.) Mandated the presence of a third-party auditor to review any proposed changes to our core infrastructure by either Sago or any of our manufacturers. We rely heavily on input from the experts that make the network gear we use, but that strategy utterly failed us in this incident. A search was begun today for a firm capable of readily assisting us with a network as robust and large as ours.

2.) A full audit by the above mentioned firm by the end of the month of all custom software provided by or implemented by any manufacturer in our network.

3.) A requirement that any manufacturer providing onsite or telephone assistance for a scheduled network window include enough personnel to reserve a 24hr shift of engineers who were available during the planning phases of an upgrade. Fatigue was an issue yesterday.

4.) An internal requirement that no more than 50% of senior engineers be involved in any single task within an 8hr period. Again, this is to combat fatigue.

5.) While we have standby devices onsite, we are now requiring any mirrored traffic tests to include all devices at that layer in the network interacting with each other.

6.) A purchase was made today for packet generating hardware capable of simulating our load for testing purposes. Previously, packet generation was done on a scaled down basis to test mostly DDoS related safeguards prior to implementation.
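
A rule like change 4 above can be checked mechanically against a shift log. This is a rough sketch only: the log format, the function name, and the reading of "within an 8hr period" as a sliding window starting at each shift are assumptions for illustration, not Sago's actual process.

```python
from datetime import datetime, timedelta


def exceeds_staffing_limit(shifts, senior_staff,
                           window=timedelta(hours=8), max_fraction=0.5):
    """Return True if more than max_fraction of senior_staff worked on
    one task within any single window.

    shifts: list of (engineer_name, start_time) entries for a single task.
    senior_staff: set of all senior engineer names.
    """
    # Keep only senior engineers, ordered by shift start time.
    senior_shifts = sorted(
        (start, name) for name, start in shifts if name in senior_staff
    )
    limit = max_fraction * len(senior_staff)
    for window_start, _ in senior_shifts:
        # Distinct senior engineers whose shift starts inside this window.
        involved = {
            name for start, name in senior_shifts
            if window_start <= start < window_start + window
        }
        if len(involved) > limit:
            return True
    return False


if __name__ == "__main__":
    seniors = {"alice", "bob", "carol", "dave"}  # hypothetical names
    day = datetime(2005, 11, 13)
    log = [
        ("alice", day),
        ("bob", day + timedelta(hours=2)),
        ("carol", day + timedelta(hours=5)),
    ]
    print(exceeds_staffing_limit(log, seniors))
```

Here three of four senior engineers start on the same task within eight hours, so the check flags a violation. Moving the third shift to hour 9 would keep every window at two of four and pass.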

This incident, like all our upgrades, was planned in excruciating detail. The upgrade was supposed to be a short, painless operation and it was widely believed by both Sago and Foundry engineers that this could be accomplished without any impact to customer operations. A wide network upgrade window was only scheduled as a formality, given the gravity of the changes being made.

We cannot stress enough that everything in our power will be done to ensure there is never a recurrence of this event. Network performance as of late has not been near our company's expectations or capabilities and it is impossible to express the level of commitment our staff has to showing how well our network will perform in the future.

As many of you are wondering, there was a reason for this upgrade. While many of these upgrades were slated to be announced later as we DO NOT HAVE DEFINITIVE INSTALLATION TIMEFRAMES (please read the emphasis on this) from our transport carriers, these changes will be implemented beginning December 1st and finished by a projected end of Q1 06.

So, as an early announcement, please know that we are implementing many of the following projects:

1.) Local peering for our Tampa datacenter with 17 other carriers, including 3 backbones.

2.) Transport between our Tampa facility and new 100,000 square foot Atlanta facility.

3.) Merger of our Tampa and Atlanta Datacenter networks via dark fiber.

4.) Addition of 4 of the following backbones (final announcement will be made later this week): Savvis, Global Crossing, BTN, Telsia, and/or Cogent (for on-net Cogent traffic ONLY).

5.) Establishment of peering with 10+ other providers and 2+ carriers in Atlanta.

The following project is currently underway with an unknown timeline:

Implementation of a hard line network to New York City - 60 Hudson to establish peering and better European transport.

Once completed, these changes will give us what we feel is one of the most stable and high performance footprints of any provider in our competitive spectrum. No amount of announcements or press releases will prove this to you after the events of the last few days, only performance. That will be our primary objective over the coming weeks and months.

In the meantime, we can only offer our assurances and continue to update you as to our findings related to this incident. Our extensive apologies can only go so far, so we look forward to the opportunity to prove our abilities to you. If you have any questions, please contact me directly.
Old 11-14-2005, 08:56 PM   #49
d00t
Confirmed User
 
Industry Role:
Join Date: Sep 2002
Location: In your mind
Posts: 3,766
Let me be the first to say ... isn't that great.. but what are you going to do for clients now? At the end of the day nobody cares what procedure you used...we just want our boxes accessible. 27 hours of downtime/extreme packet loss is totally unacceptable, regardless of how planned out it may have been or not.
Old 11-14-2005, 09:02 PM   #50
chaze
Confirmed User
 
Industry Role:
Join Date: Aug 2002
Posts: 9,752
Sounds like you're growing fast. My experience is shit happens but you have to answer the phone when it does.

Last time we had a DNS error and a huge part of our network was affected I had a couple friends come down to help cover all the phone lines. The thing never stopped but at least everyone was on the same page.

Glad everything is back.

We are also looking to buy another cage, we have plenty of room in LA but I think having another location "maybe with you guys in FL" will help out in case of emergencies.

©2000-, AI Media Network Inc


