Writing Powerful Programs in C and C++
Using techniques that avoid slow hard disk operations (Big Files)
Using structures like a lexicon, word IDs, etc.
Every word found in a document is turned into an integer number (wordID).
When somebody searches for "gay seals", the query is turned into wordIDs, for example 123 and 653. Google then looks for all documents matching these two wordIDs. This way it gets around SLOW string comparisons.
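To make the wordID idea concrete, here is a minimal C++ sketch. It is not Google's code: the names `Lexicon`, `InvertedIndex`, `intern`, `add_document` and `search` are made up for illustration, and splitting text on whitespace is a simplifying assumption. Strings are hashed once when they enter the lexicon; after that, matching works on integers only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <set>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using WordId = std::uint32_t;
using DocId = std::uint32_t;

// Lexicon: maps each distinct word to a small integer (wordID).
class Lexicon {
public:
    // Returns the existing wordID, or assigns the next free one.
    WordId intern(const std::string& word) {
        return ids_.try_emplace(word, static_cast<WordId>(ids_.size())).first->second;
    }
    // Looks a word up without adding it; empty if it was never indexed.
    std::optional<WordId> find(const std::string& word) const {
        auto it = ids_.find(word);
        if (it == ids_.end()) return std::nullopt;
        return it->second;
    }
private:
    std::unordered_map<std::string, WordId> ids_;
};

// Inverted index: for each wordID, the sorted set of documents containing it.
class InvertedIndex {
public:
    void add_document(DocId doc, const std::string& text) {
        std::istringstream in(text);
        std::string word;
        while (in >> word) postings_[lexicon_.intern(word)].insert(doc);
    }
    // Returns the documents that contain every word of the query.
    std::vector<DocId> search(const std::string& query) const {
        std::istringstream in(query);
        std::string word;
        std::vector<DocId> result;
        bool first = true;
        while (in >> word) {
            std::optional<WordId> id = lexicon_.find(word);
            if (!id) return {};  // unknown word: no document can match
            auto it = postings_.find(*id);
            assert(it != postings_.end());  // every interned word has postings
            const std::set<DocId>& docs = it->second;
            if (first) {
                result.assign(docs.begin(), docs.end());
                first = false;
            } else {
                // Keep only documents that also contain this word.
                std::vector<DocId> kept;
                std::set_intersection(result.begin(), result.end(),
                                      docs.begin(), docs.end(),
                                      std::back_inserter(kept));
                result = std::move(kept);
            }
        }
        return result;
    }
private:
    Lexicon lexicon_;
    std::unordered_map<WordId, std::set<DocId>> postings_;
};
```

After adding a few documents with `add_document`, a call like `search("harbor seals")` returns only the IDs of documents containing both words. The per-word document sets are kept sorted so that the intersection is a single linear pass.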
To rank the site Google uses a hit list.
For every document found it calculates the IR (information retrieval) score. If "seals" is, for example, in an Hn tag, it counts as a stronger hit than if it were in a normal P tag.
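A rough C++ sketch of how hits could be weighted by where they occur. The weights 4.0 and 1.0 and the names `HitKind`, `weight_of` and `ir_score` are assumptions for illustration only; the real values and formula are not given here.

```cpp
#include <cassert>
#include <vector>

// Where in the document a word occurrence (a "hit") was found.
enum class HitKind { Heading, Paragraph };

// Illustrative weights only (assumption): a heading hit counts more
// than a hit in a normal paragraph.
inline double weight_of(HitKind kind) {
    switch (kind) {
        case HitKind::Heading:   return 4.0;
        case HitKind::Paragraph: return 1.0;
    }
    assert(false && "unknown HitKind");
    return 0.0;
}

// Sums the weights of all hits of one word in one document.
inline double ir_score(const std::vector<HitKind>& hits) {
    double score = 0.0;
    for (HitKind kind : hits) score += weight_of(kind);
    return score;
}
```

With these made-up weights, a document with "seals" once in a heading scores higher than one with "seals" once in a paragraph, which is the ordering described above.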
Then it uses factors like PR, IBLs, anchor text, proximity of keywords and a lot of other stuff we do not know about to give the ranking its final touch.
This powerful piece of software is then run on 1000s of cheap servers.
Full Paper