Quote:
Excellent post dude. :2 cents::2 cents: |
I agree with whoever said that you guys think too much. For me, getting a quality host is not something that can be determined on a web board, or by any means other than actually setting a box up on the network and seeing how it does. All of the fancy devices in the world don't mean a thing if, at the end of the day, the network itself sucks. The only way to truly find out if a network sucks is to put a box on it and see how it does.
|
Thanks for the hot tips toejo:thumbsup
|
very nice post brad.
- the only reason I'm not hosting some stuff with brad right now is that I was too busy to call back earlier this week. info like this makes me think "First Thing Monday" :) |
Hi Everyone
I was out of town skiing since Friday and just now finally have the time to respond to all of the posts. This is a very detailed response, and for those of you who take the time to read it, I thank you in advance! Cheers, Brad

-------------------

Hi Sam,

Just a few comments on your post, because you seem to have some misconceptions as to what the FCP does and does not do.

>> However, unfortunately the FCP is not the be-all end-all of network performance - it's not a panacea, nor is it
>> even adequate on its own. It is certainly not a replacement for qualified, top-tier network engineers.

First off, I totally agree with this comment. The FCP is certainly not a "set it and forget it" appliance like a Ronco oven. It certainly cannot and does not replace the need for skilled network engineers, quality equipment, proper network design, and a diverse mix of quality transit providers and peers. If you simply expect to plug an FCP (or any other intelligent traffic engineering device) into a poorly designed network lacking substantial diversity and having a poor native routing policy, you really would not get value out of the device.

The FCP's real value proposition is for multi-homed networks that have the bases above covered. It simply augments an already solid network environment by providing a level of automated, real-time, qualitative route analysis and policy adjustment that simply cannot be matched by nearly ANY level of manual (human) effort -- no matter what your budget may be for traffic engineers.

Not to get overly technical here, but the ugly truth is that native BGP4 couldn't care less about actual path performance. There are no objective performance metrics built into the protocol. It is not aware of path latency, link speed, congestion, errors, etc. -- it is only aware that a path exists, or it does not. Simply having a big mix of tier-one carriers without properly optimizing your tables buys you very little. The only real clues you have (and they're not very good) in evaluating the anticipated performance of different transit paths to a given destination are 'hop' count (AS-path length) and potentially the MEDs received from transit carrier networks. The problem is that neither of these is generally a reliable indication of latency or other real link quality attributes at all. Both can be manipulated by outside forces for unknown reasons (e.g., did someone prepend their AS-path 5x via carrier "X" because it's of poor quality, or because it's expensive and they don't want to use it?). Did your upstream provider lower the MED he sends you on such and such a path because it's better, or because it's cheaper? (A toy illustration of this follows after this post.)

Bottom line: for the most part, traffic engineers don't have a whole lot to go on (performance-wise) without using tools external to the protocol itself to make any really effective judgments when trying to optimize native BGP policy for better end-user performance -- other than making sweeping, large-scale generalizations like "Provider X has really poor quality coverage of Europe, so let's discourage egress to all the RIPE-issued blocks via that carrier." It is a fairly tedious process to sniff out poorly performing paths and move them to another carrier, especially since you don't really know whether, 10 minutes after you moved traffic off of carrier "X" onto carrier "Y" to reach a given destination, the original carrier repaired a circuit or added capacity, etc. Now perhaps that path you just changed would have been better off left alone. You'll never know unless you happen to look at it again.

The reality is that, other than those sweeping generalizations (which are often not beneficial at all), most deliberate routing policy changes made by traffic engineers -- even at the largest and most highly skilled and staffed network providers -- are made reactively, in response to either a large-scale (obvious) incident or the complaint of a customer, which triggers an internal analysis that discovers "hey... we do have a much better way to reach network Z," etc. You would quickly run out of money if you tried paying a bunch of traffic engineers to sit all day looking at your active traffic with netflow or a sniffer and then making a series of surgical adjustments to your routing policy in order to guarantee your customers you're giving them the best product you could given your carrier mix. It simply isn't done.

This is the beauty of an automated route analysis/control platform like the FCP. It is actually looking in real time at real conversations taking place between hosts on your network and a remote network. If the conversations out to a specific ASN are significant in number, size, or duration, an answering host within that network is 'flagged' for analysis. The FCP then 'probes' (similar to a UDP traceroute) ALL of your available transit provider links and obtains their current performance characteristics (latency, packet loss, etc.) in reaching that destination network. If a significantly better performing link is found in your mix of carriers than the one you're already using -- and if moving the traffic over to that link won't oversubscribe any links or kill you on costs relative to your own 95ths -- then it's moved. If an hour later those conditions have changed and another provider now offers the best performance, it's moved again. (A rough sketch of this decision loop also follows after this post.)

It gets even better. Let's say an upstream provider created a blackhole condition in reaching a given remote network. If the FCP notices a sudden increase in packet loss on an existing conversation between a host on your network and one on the recently blackholed network (remember, it is getting SPANed <port mirrored> traffic off all of your physical provider links for analysis), it will instantly check whether the host is reachable via any other provider and move away from the blackhole if that's an option given your other available carriers.

Now, what the FCP does not do: it has NOTHING to do with ingress routing, only egress. It cannot make your ISP send traffic to us in a way that it doesn't want to. Through native policy, I can try to influence how your ISP might send me traffic (such as in the example earlier, by prepending my ASN, etc.), but your provider can choose to ignore this, or treat it differently than I intended.

So, on to your examples -- all of which are examples of ingress routing from our point of view:

1) Your traceroutes indicate that your provider is either Qwest itself, or is forwarding traffic to Qwest on the very first hop (I can't tell, since you removed it from the post). It appears that Qwest has no direct peering in Tampa, FL (not unusual) -- so they are backhauling your traffic into Washington, DC. That seems to take around 40ms or so to get from Tampa to WDC and bop around a few hops until we reach Qwest's edge router with L3 at hop 5. Then, on L3's net, we bounce around DC some more and pass through Atlanta on our way back to Miami -- with a total RTT of 64ms. 64ms, given this geographic path, does not seem that unreasonable to me. However, it does seem that the lion's share of the path latency is in fact on the outbound Qwest leg from Tampa to DC. The return using L3 from DC into Miami is almost half the latency -- but again, in my opinion, both are reasonable given the circumstances. If by chance you were looking at the reported latency figure in hop 12 of the first trace: your traceroute just happened to coincide with the execution of a high-priority process on one of our routers' CPUs (likely BGP Scanner), which gave you a 'false' latency figure. You'll notice that it completed by the next hop, and your end-host latency is reported at the more believable 64ms. (The same goes for hop 8 of your second trace -- which is NOT to or on our network.) Again, this is all ingress routing from our perspective, so the FCP is simply not involved. If you're looking for better performance from your network into Miami, perhaps you should consider another provider, or try talking Qwest into adding local peerings.

2) I'm not sure what the traces to the other hosts accomplish. They are to completely different networks/geographies, and honestly none of the RTTs are that bad. As you indicate, Tampa is indeed closer in physical distance to Miami than it is to New York -- however, networks are often not engineered as the crow flies. In this particular case, Qwest is putting your outbound traffic to us on a slightly latent link to Washington and then handing it over to L3, who turns around and sends it to Miami by way of Atlanta in half the time.

3) Just for kicks, I checked our path options to reach you (or as close to you as I could get, lacking your IP address). Specifically, I did a manual path analysis to Qwest's router in Tampa (tpa-core-01.inet.qwest.net <205.171.27.101>) using the FCP on our network, and have some interesting results to share. I will first point out that this particular path (to reach 205.171.0.0/18) is NOT currently being engineered by the FCP, likely because we're not exchanging sufficient traffic with that network. So, based on our native policy, we are currently routing that prefix via Level3, and our path RTT is 64ms -- same as yours on the way in. (It looks like the identical path in reverse, for that matter.) (continued...) |
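To make the point about BGP's blindness concrete, here is a toy sketch in Python. The candidate routes and every number in it are invented, and the real best-path algorithm has more steps (local-pref, origin, and so on); it only illustrates that measured latency never enters the decision:

[CODE]
# Toy illustration: BGP4's best-path choice looks at attributes like
# AS-path length and MED, never at measured latency. Routes and numbers
# below are invented for this example.

routes = [
    {"via": "CarrierX", "as_path_len": 2, "med": 10, "real_rtt_ms": 95},
    {"via": "CarrierY", "as_path_len": 4, "med": 0,  "real_rtt_ms": 33},
]

# Simplified tie-break: prefer shortest AS path, then lowest MED.
best = min(routes, key=lambda r: (r["as_path_len"], r["med"]))
print(best["via"])  # CarrierX -- chosen despite the 95ms real-world RTT

# real_rtt_ms never enters the decision; and since operators can prepend
# AS paths or tweak MEDs for cost reasons, even the inputs BGP does use
# say little about actual path quality.
[/CODE]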
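And here is a minimal sketch of the flag/probe/compare/move cycle described above. The FCP's actual internals are proprietary, so every function name, threshold, and signature here is invented purely for illustration -- this is not the appliance's real code:

[CODE]
# Hypothetical sketch of an egress-optimization cycle: probe the flagged
# destination over every transit link, then move traffic only if a link
# is meaningfully better and has headroom. All names/thresholds invented.

import random
from dataclasses import dataclass

@dataclass
class ProbeResult:
    transit: str      # provider link, e.g. "Level3" or "Cogent"
    rtt_ms: float     # round-trip latency measured over that link
    loss_pct: float   # packet loss seen by the probes

def probe(transit: str, dest_ip: str) -> ProbeResult:
    # Stand-in for the UDP-traceroute-style probe described above;
    # fake numbers so the sketch runs on its own.
    return ProbeResult(transit, random.uniform(30, 90),
                       random.choice([0.0, 0.0, 2.5]))

def choose_egress(dest_ip: str, current: str, transits: list[str],
                  utilization: dict[str, float]) -> str:
    """Probe the destination over every transit link and return the
    link traffic should egress on: lowest loss first, then latency."""
    results = {r.transit: r for r in (probe(t, dest_ip) for t in transits)}
    best = min(results.values(), key=lambda r: (r.loss_pct, r.rtt_ms))
    cur = results[current]
    # Move only if the win is meaningful and the new link has headroom;
    # otherwise leave the native BGP choice alone. Thresholds invented.
    if (best.transit != current
            and (cur.loss_pct > 1.0 or best.rtt_ms < cur.rtt_ms * 0.8)
            and utilization.get(best.transit, 1.0) < 0.85):
        return best.transit
    return current

links = ["Level3", "Cogent", "PCCW", "ATT"]
util = {"Level3": 0.60, "Cogent": 0.40, "PCCW": 0.30, "ATT": 0.70}
# Re-run this for each flagged destination on every analysis cycle:
print(choose_egress("205.171.27.101", "Level3", links, util))
[/CODE]

Run something like this once per flagged destination per cycle and the chosen egress keeps converging onto whichever link is performing best at that moment -- which is the "moved, then moved again an hour later" behavior described above.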
Interestingly enough, however, we do have a better option on the table that we could (and would) automatically use if we started exchanging any significant traffic. In fact, we could almost cut that latency figure in half by using (believe it or not) PCCW Global to send the traffic, which can reach that router in a mere 33ms. Our runner-up carrier would be another long-shot guess for most -- Cogent, at 43ms. (See the graphic I'm pulling these stats from for yourself):

[URL="http://www.mojohost.com/gfy/LG-Output.JPG[/URL]

So, thank you for your example traceroute -- I couldn't have scripted this scenario any better. This precisely illustrates the value of the FCP in our network. BTN just happened to have a direct peer in Dallas, one hop away from Qwest, that seems to be a lot closer/faster to the source than WDC/L3 is. Would you have anticipated that would have been our best choice to reach that network? Not likely. Be honest -- it was even said in this thread that the general understanding many have is that BTN has 'crappy coverage' of the East Coast. Is there any way your traffic engineer would have found that on his own and changed your policy to use that path, without you or someone (or something) asking him to? Not likely at all.

Finally, you mentioned:

>> Quote:
>> Originally Posted by Maz
>> A host not using provider pricing as primary metric when making routing decisions, very hard to believe, but
>> maybe there are still good people left in this

>>> Interesting that you mention this - given that one of the FCP's main selling points is the ability to make
>>> routing decisions based on cost (you enter cost per megabit per provider, and what you want your total cost to
>>> be) rather than necessarily the best route. E.g. it intentionally facilitates not necessarily using the best
>>> route, utilizing a cheaper "acceptable" route instead.

Of course the FCP is also taking link utilization and bandwidth cost into account. If it didn't consider cost, it would be promptly returned to the cardboard box it came in. ;)

This response is already way too long, so I won't go into too much detail here other than to say that the FCP collects metrics on all probed paths and then performs multiple passes of logic to arrive at a list of optimizations (path changes) it wants to make. In our configuration, our box is in what is called "Performance Sensitive" mode -- meaning that the primary (first) pass looks strictly at performance data and does not consider the cost of bandwidth. Once that list is generated, another pass determines the cost implications of implementing that list of changes, and if a change would push an 'expensive' link over a desired threshold, it will see if there is any traffic currently on that link that could go elsewhere without much performance penalty -- a 'swap', if you will. (A rough sketch of this two-pass idea follows after this post.)

In some rare cases, is it possible that we would continue to use a link that gives us a 45ms RTT when we have the choice to drop to 40ms, because switching would violate a 95th tier and increase a carrier bill by a few hundred bucks? Yes, it is possible. The farther apart those numbers get, however, the more insistent the FCP is about figuring out a way (a swap) to use the better link without violating the link's cost tier. The reverse is also true: sometimes we just have to live with a higher bill because of large-scale problems with one of our carriers and the need to divert traffic en masse off their link until it's resolved.

Although quite valid, I think your question assumes a tighter relationship between provider cost and provider quality than often exists in reality. Most of the time the FCP has no problem finding both very good and very bad paths to networks over every provider we have. Look down on carriers like Cogent or BTN all you want and praise ones like L3 and ATT and GBLX; if we were just single- or dual-homed, I can promise you I'd be picking from the latter group myself. However, I can tell you from experience that all carriers have their good and bad points, and I can back up my experience with some hard data if you're interested. Every carrier has value -- it's just sometimes a lot of work to find out what it is. The FCP just makes that job much easier and more accurate. ;)

Sincerely,
Brad Mitchell |
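Lastly, a hypothetical sketch of the two-pass "Performance Sensitive" logic described above. The tier check and the "defer instead of swap" shortcut are invented simplifications for illustration, not the FCP's real algorithm:

[CODE]
# Hypothetical two-pass sketch: pass 1 (not shown) picks moves purely on
# performance; this cost pass vets each move against the target link's
# cost tier (e.g. a 95th-percentile commit). All numbers invented.

def cost_pass(perf_moves, links):
    """perf_moves: list of (prefix, from_link, to_link, mbps) chosen by
    the performance-only first pass. links: link -> {"mbps": current
    load, "commit": the tier we don't want to cross}. Returns only the
    moves that won't push a link past its tier."""
    accepted = []
    for prefix, src, dst, mbps in perf_moves:
        projected = links[dst]["mbps"] + mbps
        if projected <= links[dst]["commit"]:
            links[dst]["mbps"] = projected
            links[src]["mbps"] -= mbps
            accepted.append((prefix, src, dst, mbps))
        # else: the real logic would hunt for a 'swap' -- traffic on dst
        # that could move elsewhere with little performance penalty --
        # before giving up; this sketch simply defers the move.
    return accepted

links = {
    "Level3": {"mbps": 800.0, "commit": 1000.0},
    "Cogent": {"mbps": 950.0, "commit": 1000.0},
}
perf_moves = [("205.171.0.0/18", "Level3", "Cogent", 120.0)]
print(cost_pass(perf_moves, links))  # [] -- deferred: would cross Cogent's tier
[/CODE]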
reading this today...thanks Brad
|
Looks like I messed up the tag in that last post...
Here's the image: http://www.mojohost.com/gfy/LG-Output.JPG |
Good on you for posting those stats... and thanks for making another fine deal with me today. :)
|
Some good information in here.
I know I learned something. My host is comfortable enough to go skiing for the weekend. Knowing that he knows how demanding I am, he must really trust his staff and equipment. ;) |
Quote:
I did have my cell phone handy at the top of the ski hill just in case you wanted to call me for kicks, though! Brad |
...Bump for those who thought they had valid criticism.
Brad |
hello, lol hehehehe
|
<-- proud to host with mojo :)
|
wow quite informative Brad, you rock! thanks for sharing :thumbsup
|
why does everyone assume that brad "bought" me lapdances? christ, i mean i got them from him personally.
ROFLLLLLLLLLLLLLLLLLL |
I'm waiting patiently for SMachiz and Post Rat in Hell to revisit this thread and find some humility. I'm not expecting the latter (who would, here on GFY?), but I like to leave room for the possibility, as people can sometimes be pleasantly surprising.
Brad |
Bump for the haters.
Brad |
Bump for more haters
|
I am so disappointed they're not coming back to the thread. :/
Brad |
Quote:
these fuckers on their high horses are the dregs of the industry. fuck them. you and i have more passion for what we do in our pinky toes than they do in their entire bodies. fuck em man. fuck em. |
Tell us how you really feel, don't hold it in
|
Hi, Brad -
Once again, an excellent write-up. Clearly, the negative parts of your audience have little to go by, but at least it's gotten this thread a lot of positive attention in return! Your knowledge in specialized fields like this is what sets your company apart from many others. Again, good job. |
I'm bookmarking this thread to read later. I'd really like to learn how this works.
Thanks for the post Brad. |
Thank you everyone for the positive feedback. I'm really pleased that so many took the time to read the details that I have posted here. For those with more questions or concerns, please feel free to make them known in this forum or contact me offline by any means desired :)
Looking forward to seeing those of you going to Vegas! I will be there bright and early tomorrow. Sincerely, Brad Mitchell |
Hiiiiiiiiiiiiiiiiiiii Brad!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
The haters have officially been deemed own3d :thumbsup
Here kitty kitty. Brad |
meoow......no hater here, Great post......how could I have missed this one :eek7
|
Bump for a quality thread.
|