HipHop is Facebook's attempt to speed up PHP. It succeeds to some degree, but I don't think it's the best approach if you have a choice. If you have lots of legacy PHP it might be a good way to go, which is why it makes a lot of sense for Facebook. Even so, they are moving to other languages for some things. For example, they use Erlang for their chat architecture.
HipHop is good for legacy PHP code and for CPU-intensive work like data mining or machine learning.
Another thing to note: it doesn't compile PHP directly. It converts PHP source code to C++ source code, which is then compiled. When you translate from a high-level language to a low-level one there are inefficiencies, so you don't get the full benefit of the lower-level language. A good analogy is an automatic with a slap-stick shift (I forget what they are called). You get some gain, but not having a clutch rules out certain niche techniques like blipping the throttle, changing the rotational speed of the flywheel, or skipping gears.
Caching is probably the best place to start. There is something called APC (Alternative PHP Cache). It takes the PHP source code, compiles it to the bytecode used by the PHP virtual machine, and keeps that bytecode around. The next time the page is requested it can just execute the cached bytecode instead of compiling it again. It is still running interpreted bytecode, but it skips the compilation phase.
Even better is to cache the output itself, ideally the whole page. If that isn't possible, you can cache fragments and store them in APC or something like memcached.
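Fragment caching could look roughly like the sketch below. It uses a plain in-process dictionary with a time-to-live as a stand-in for APC or memcached. The class name, `get_or_render`, and the sidebar example are assumptions for illustration, not any real library's API.

```python
import time


class FragmentCache:
    """Tiny in-process TTL cache; a stand-in for APC's user cache or memcached."""

    def __init__(self, clock=time.monotonic):
        self._store = {}     # key -> (expires_at, html)
        self._clock = clock  # injectable so expiry can be tested

    def get_or_render(self, key, ttl_seconds, render):
        """Return the cached fragment, or call render() and cache its result."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive render
        html = render()
        self._store[key] = (now + ttl_seconds, html)
        return html


render_calls = []


def render_sidebar():
    # Hypothetical expensive fragment (database queries, templating, ...).
    render_calls.append(1)
    return "<ul><li>Top posts</li></ul>"


cache = FragmentCache()
a = cache.get_or_render("sidebar", 60, render_sidebar)
b = cache.get_or_render("sidebar", 60, render_sidebar)  # no second render
```

The same get-or-render pattern applies to a whole page: the key becomes the URL and the fragment becomes the full response body.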
Nginx and Pound can be used to cache transparently as well. In practice nginx is usually much faster and uses less memory than Apache. Direct I/O is something to look into as well if you are serving lots of content. If you have any kind of internal communication going on between processes on the same machine, use local Unix domain sockets instead of TCP. They have less overhead.
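For the Unix domain socket suggestion, here is a minimal round trip in Python. The only real difference from TCP code is the address family (`AF_UNIX`) and that the address is a filesystem path instead of a host and port. The socket path and the echo protocol are made up for illustration, and this only runs on systems that support Unix sockets.

```python
import os
import socket
import tempfile
import threading


def serve_once(server_sock):
    """Accept one connection and echo the message back with a prefix."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo:" + data)


def unix_round_trip(payload):
    """Send `payload` to a local server over a Unix domain socket."""
    # Hypothetical socket path inside a fresh temp directory.
    path = os.path.join(tempfile.mkdtemp(), "app.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # the address is a path, not (host, port)
    server.listen(1)
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    client.sendall(payload)
    reply = client.recv(1024)  # fine for a tiny message like this one
    client.close()

    worker.join()
    server.close()
    os.unlink(path)  # the socket file is not removed automatically
    return reply
```

Calling `unix_round_trip(b"ping")` sends the bytes through the kernel without going through the TCP/IP stack at all, which is where the saving comes from.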
If that fails then you can look into using a faster language / environment to handle just the parts that need it.
Java is actually decently fast (roughly 2-3x slower than C) compared to PHP (roughly 50x slower than C), though the exact numbers depend a lot on the workload. But if you need raw speed, nothing beats C or assembly.
People who can optimize assembly are pretty rare now, though. Most programmers don't know the relative speeds of accessing something from magnetic disk, SSD, memory, L2/L1 cache, cache lines, and registers. To give an idea of how extreme it is: accessing registers is like grabbing a piece of paper from the top of your desk. L1 cache is like opening a drawer and pulling out a paper. Memory (RAM) is like going to an office in another department and asking someone else for the document. Accessing a hard drive is like booking a flight and flying to another country and back to get the document.
What that means is that most of the bottleneck (especially with lots of static file content, as on adult sites) is going to come from your hard drive.
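To put rough numbers on that analogy, here is a small sketch. The latencies are commonly quoted order-of-magnitude figures that I am assuming for illustration, not measurements; real hardware varies widely.

```python
# Assumed, order-of-magnitude access latencies in nanoseconds.
# These are illustrative ballpark figures, not benchmarks.
LATENCY_NS = {
    "register": 0.3,
    "L1 cache": 1,
    "L2 cache": 4,
    "RAM": 100,
    "SSD random read": 100_000,
    "hard disk seek": 10_000_000,
}


def relative_cost(slower, faster):
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def human_scale_seconds(level, baseline="L1 cache"):
    """Rescale so that one `baseline` access takes one second."""
    return LATENCY_NS[level] / LATENCY_NS[baseline]


for level in LATENCY_NS:
    print(f"{level:>16}: {human_scale_seconds(level):>14,.1f} s if L1 took 1 s")
```

With these assumed figures, if an L1 hit took one second then a hard disk seek would take about ten million seconds, which is on the order of months.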
I know a guy who needed to serve one particular file in high volume, so he wrote a custom web server. He knew that any client connecting could only be asking for that one file (it was the only one they put on that domain). So the custom web server sent the raw bytes from memory as soon as the connection was opened. It didn't even bother parsing the request because he already knew what they wanted. The file was also stored in shared memory, so it didn't have the normal double buffering problem of having to copy to a kernel buffer first.
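A rough sketch of that idea in Python is below. It is my own illustration, not his server: the payload, the headers, and the `serve` function are assumptions, and it leaves out the shared memory trick entirely. It only shows the "reply on connect without reading the request" part.

```python
import socket
import threading

# Hypothetical file contents, loaded into memory once at startup.
PAYLOAD = b"hypothetical file contents"

# The complete response is built once and reused for every connection.
RESPONSE = b"".join([
    b"HTTP/1.0 200 OK\r\n",
    b"Content-Length: " + str(len(PAYLOAD)).encode("ascii") + b"\r\n",
    b"Connection: close\r\n",
    b"\r\n",
    PAYLOAD,
])


def serve(listener, max_connections):
    """Send the prebuilt response to each client as soon as it connects."""
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(RESPONSE)  # the request is never read or parsed


def fetch(port):
    """Test client: connect, send nothing, and read until the server closes."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(8)
port = listener.getsockname()[1]
worker = threading.Thread(target=serve, args=(listener, 1))
worker.start()
received = fetch(port)
worker.join()
listener.close()
```

One caveat with this shortcut: closing a connection while the client's request is still sitting unread can cause a reset on some network stacks, so a production version would need more care than this.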
If you build your architecture right you should still be able to use PHP. As long as you are not doing anything really CPU intensive, your bottlenecks will be your hard drive and how fast the network card can write. Upgrading memory and switching to solid state drives typically gives the most improvement for the least effort.