03-26-2004, 04:52 AM
nastyking
 
Join Date: Nov 2002
Posts: 2,174
Writing Powerful Programs in C and C++

Using techniques to avoid slow hard disk operations (Big Files)

Using things like a lexicon, word IDs, etc.

Every word found in a document is turned into an integer (wordID).

When somebody searches for "gay seals", the two words are turned into, for example, 123 and 653. Google then looks for all documents matching these two word IDs. This way it gets around SLOW string comparisons ...
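Here is a rough C++ sketch of the word ID idea, just to make it concrete. It is not Google's code: the names Lexicon, InvertedIndex, intern, lookup, addDocument and search are made up for this post, and I assume documents are added in increasing doc ID order so that each word's document list stays sorted.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

using WordId = std::uint32_t;
using DocId = std::uint32_t;

// Hypothetical lexicon: maps each distinct word to a small integer ID.
class Lexicon {
public:
    // Returns the existing ID for `word`, or assigns the next free one.
    WordId intern(const std::string& word) {
        auto it = ids_.find(word);
        if (it != ids_.end()) return it->second;
        WordId id = static_cast<WordId>(ids_.size());
        ids_.emplace(word, id);
        return id;
    }

    // Looks a word up without adding it; returns false if it is unknown.
    bool lookup(const std::string& word, WordId& out) const {
        auto it = ids_.find(word);
        if (it == ids_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, WordId> ids_;
};

// Hypothetical inverted index: word ID -> sorted list of documents containing it.
class InvertedIndex {
public:
    // Assumption: called with increasing doc IDs, so every list stays sorted.
    void addDocument(DocId doc, const std::string& text) {
        std::istringstream in(text);
        std::string word;
        while (in >> word) {
            std::vector<DocId>& docs = postings_[lexicon_.intern(word)];
            if (docs.empty() || docs.back() != doc) docs.push_back(doc);
        }
    }

    // Returns the documents that contain every word of the query.
    std::vector<DocId> search(const std::string& query) const {
        std::istringstream in(query);
        std::string word;
        std::vector<DocId> result;
        bool first = true;
        while (in >> word) {
            WordId id = 0;
            if (!lexicon_.lookup(word, id)) return {};  // unknown word: no match
            const std::vector<DocId>& docs = postings_.at(id);
            if (first) {
                result = docs;
                first = false;
                continue;
            }
            // From here on only integers are compared, never strings.
            std::vector<DocId> both;
            std::set_intersection(result.begin(), result.end(),
                                  docs.begin(), docs.end(),
                                  std::back_inserter(both));
            result.swap(both);
        }
        return result;
    }

private:
    Lexicon lexicon_;
    std::unordered_map<WordId, std::vector<DocId>> postings_;
};
```

Each query word is hashed once to get its ID, and after that the matching is plain integer work on sorted lists. A real engine would keep these lists on disk in a compact format; this sketch keeps everything in memory.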

To rank a site, Google uses a hitlist.

For every document found it calculates the IR (information retrieval) score. If "seals" is, for example, in an Hn tag, it counts as a stronger hit than if it were in a normal P tag ...
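The weighting idea could look roughly like this in C++. This is only a sketch: the weights 1.0 and 3.0 are invented for illustration (the real values are not public), and HitType, Hit, kTypeWeight and irScore are names I made up.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Where in the document the word was found.
enum class HitType : std::size_t { Paragraph = 0, Heading = 1 };

// One occurrence of a query word in a document.
struct Hit {
    HitType type;
};

// Invented weights: a hit in a heading counts more than one in a paragraph.
constexpr std::array<double, 2> kTypeWeight = {1.0, 3.0};

// Adds up the weight of every hit in a document's hitlist.
double irScore(const std::vector<Hit>& hitlist) {
    double score = 0.0;
    for (const Hit& hit : hitlist) {
        score += kTypeWeight[static_cast<std::size_t>(hit.type)];
    }
    return score;
}
```

With these made-up weights, a document with "seals" once in a heading scores higher than one with "seals" once in a paragraph. The real scoring is surely more involved than a plain sum.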

Then it uses factors like PR (PageRank), IBLs (inbound links), anchor text, proximity of keywords and a lot of stuff we do not know about to give the ranking its final touch.

This powerful piece of software is then run on 1000s of cheap servers ...



Full Paper
Last edited by nastyking; 03-26-2004 at 04:54 AM..