Quote:
Originally posted by jpoker
jpv, you can indeed do this. There are several advantages to setting things up this way even if one given server
can handle the load. This is especially true
if you are hosting with places like servermatrix where a
given server with 1200 GB is really cheap, but overage
is expensive.
1) if one of your servers has a hardware failure you still
have the other one in operation and can direct all
the load to it.
2) when your traffic does grow you already have
a plan in place on how to scale.
There are some disadvantages though.
1) you'll need to keep your content mirrored across
the two (or more) servers. rsync over ssh is probably
the most common method of doing this and is
quite fast.
2) if you are using DNS round robin to distribute the load
you can run into some unusual issues if you have approximately
23 or more servers in your rotation. The DNS system
normally uses UDP packets to communicate. However,
if the response from the DNS server is over a certain
size (512 bytes for traditional DNS over UDP), the
communication switches to TCP. Now,
this shouldn't pose a problem, but for some reason
it does. I found that when I exceeded 23 servers in
my rotation, some people would start complaining
that they couldn't reach my site. I never figured out
the exact cause, though, or why it only occurred on
some computers and not others.
3) 2 or more servers means more server administration.
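To put rough numbers on the DNS size issue in point 2, here is a minimal sketch that estimates how big a reply gets as address records are added. It assumes name compression, IPv4 (A) records only, and no authority or additional records; the hostname is a made-up example. Real replies usually carry extra records, which would pull the practical limit below the figure computed here, so a threshold in the low twenties is plausible. A TCP fallback blocked somewhere along the path is one possible reason only some users had trouble, but the post does not confirm the cause.

```python
# Rough size of a DNS reply carrying N address (A) records.
# Assumptions (not from the original post): name compression is used and
# the reply has no authority or additional records.
HEADER_BYTES = 12     # fixed DNS header
UDP_LIMIT = 512       # classic DNS-over-UDP message limit (RFC 1035)
A_RECORD_BYTES = 16   # 2 name pointer + 2 type + 2 class + 4 TTL + 2 length + 4 IPv4


def question_bytes(hostname: str) -> int:
    """Encoded question: length-prefixed labels, root byte, type and class."""
    labels = hostname.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1 + 4


def response_bytes(hostname: str, n_records: int) -> int:
    """Estimated reply size for n_records A records."""
    return HEADER_BYTES + question_bytes(hostname) + n_records * A_RECORD_BYTES


def max_records_over_udp(hostname: str) -> int:
    """Largest number of A records that still fits in one UDP reply."""
    budget = UDP_LIMIT - HEADER_BYTES - question_bytes(hostname)
    return budget // A_RECORD_BYTES


if __name__ == "__main__":
    name = "www.example.com"  # hypothetical hostname
    print(response_bytes(name, 23), max_records_over_udp(name))
```

With this bare-bones model the budget runs out somewhat above 23 records; every NS or glue record the name server adds to the reply eats into it further.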
As for your question regarding TGP/MGP scripts... to the best
of my knowledge, I have never had a gallery rejected
because it is on a load balanced system.
- jpoker
I wouldn't recommend the DNS method of load distribution. DNS entries get cached, and tend to produce 'clumps' of hits... not very evenly spread. There are a couple of ways to get load spread out nice and evenly across multiple servers.
The expensive way is to invest in a hardware load balancer. There's a variety of them on the market but they all basically do the same job: Take in hits on a virtual IP address and farm them out to real servers inside your network.
Advantages: They're standalone bits of hardware. All they do is load-balance stuff. You can usually point MRTG at them and get statistics too, if you like pretty graphs. Many come with pretty front-ends so setup is a breeze.
Disadvantages: Can be a substantial cost for small to medium-sized webmasters. Fault-tolerant operation requires multiple units, multiplying the cost.
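To illustrate why a balancer on a virtual IP spreads load more evenly than DNS rotation, here is a toy sketch. It is a simplified model, not how any particular product works: the server names, resolver names, and hit counts are all made up, and it assumes every client behind a resolver follows that resolver's cached answer for the whole period.

```python
from collections import Counter
from itertools import cycle

SERVERS = ["web1", "web2"]  # hypothetical real servers

# Hypothetical resolvers and the hits sent by the clients behind each one
# while its cached DNS answer is still valid.
HITS_PER_RESOLVER = {"big-isp": 1000, "small-isp": 50, "office": 10}


def dns_round_robin(hits_per_resolver, servers):
    """Rotation happens once per resolver lookup; cached hits all follow it."""
    rotation = cycle(servers)
    load = Counter()
    for hits in hits_per_resolver.values():
        load[next(rotation)] += hits
    return load


def per_hit_balancer(hits_per_resolver, servers):
    """A balancer on a virtual IP picks a real server for every single hit."""
    rotation = cycle(servers)
    load = Counter()
    for _ in range(sum(hits_per_resolver.values())):
        load[next(rotation)] += 1
    return load


if __name__ == "__main__":
    print("DNS rotation:", dict(dns_round_robin(HITS_PER_RESOLVER, SERVERS)))
    print("Per-hit balancer:", dict(per_hit_balancer(HITS_PER_RESOLVER, SERVERS)))
```

In this model one busy resolver is enough to pile most of the traffic onto a single box under DNS rotation, while the per-hit balancer splits the same traffic down the middle.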
The cheaper (and in my opinion, more flexible) way is to run a Unix box with virtual server support. It's free software, and unless you're pushing a WHOLE bunch of traffic (500 Mbps+) it'll handle your load easily, assuming you put the software on a box with sufficiently capable hardware. You can easily get 250 Mbps+ out of a stock P4 Dell 1U box with 2x GigE copper ports.
Advantages: Cheap, and very flexible. If you have a resident geek, he can make this sort of setup do backflips.
Disadvantages: More difficult to set up. Will require someone with reasonably advanced technical know-how to get it running (although it's a fire-and-forget type of tool that requires very little maintenance once set up).
Load balanced servers do require data synchronization. Rsync works but multiplies your data storage requirements (not an issue if your data set is small). Another alternative is to keep your files in one central location and mount them over NFS, so changes to the master immediately take effect on the front-end slave boxes.
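For the rsync route, a minimal sketch of a mirroring script might look like the following. The paths and host names are placeholders, and it assumes rsync is installed on both ends and that key-based ssh logins to the front-end boxes are already set up. Note that --delete removes files on the target that no longer exist on the master, so a dry run first is a good idea.

```python
import subprocess


def build_rsync_command(source_dir, host, dest_dir, dry_run=False):
    """Build an rsync-over-ssh command that mirrors source_dir to host:dest_dir."""
    cmd = ["rsync", "-az", "--delete", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    # Trailing slash on the source: copy the directory's contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", f"{host}:{dest_dir}"]
    return cmd


def mirror(source_dir, hosts, dest_dir, dry_run=False):
    """Push the master copy to every front-end box, stopping on the first failure."""
    for host in hosts:
        subprocess.run(build_rsync_command(source_dir, host, dest_dir, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical layout: one master directory pushed to two front ends.
    mirror("/var/www/site", ["web1.example.com", "web2.example.com"],
           "/var/www/site", dry_run=True)
```

Run from cron on the master, something like this keeps the front ends in step, at the cost of one full copy of the data per box.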
The only other caveat is data storage. If you have scripts writing data to a file local to the machine when submissions are made, then that submission won't be known to other boxes in your load group until you sync them. If submissions come from 2 or more machines in your group simultaneously, you could lose data in a 'collision'. Scripts which use central database stores (postgres, mysql, oracle etc) don't have these problems. If your script uses MySQL, chances are this won't be an issue for you.
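The collision problem can be sketched in a few lines. This is a toy model under stated assumptions: plain lists stand in for the per-machine data files, the sync is a naive overwrite from a master, and an in-memory SQLite database stands in for a central MySQL or Postgres server. The gallery names are made up.

```python
import sqlite3


def sync_by_overwrite(source, target):
    """Naive mirror: the target becomes an exact copy of the source."""
    target.clear()
    target.extend(source)


# Case 1: each front end appends to its own local "file" (a list here).
web1_local, web2_local = [], []
web1_local.append("gallery-A")  # submission handled by web1
web2_local.append("gallery-B")  # submission handled by web2 at about the same time
sync_by_overwrite(web1_local, web2_local)  # web1 is treated as the master copy
# web2's submission has now been overwritten and is lost.

# Case 2: both front ends write to one central store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")


def submit(conn, url):
    """Insert one submission; the database serializes concurrent writers."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO submissions (url) VALUES (?)", (url,))


submit(db, "gallery-A")  # arrives via web1
submit(db, "gallery-B")  # arrives via web2
stored = [row[0] for row in db.execute("SELECT url FROM submissions ORDER BY id")]

if __name__ == "__main__":
    print("local files after sync:", web2_local)
    print("central store:", stored)
```

With local files, whichever machine is not the sync master loses its pending submissions; with the central store, both entries survive no matter which front end took the hit.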
Hope this helps some, and doesn't just confuse you more.
