This is fucking PRICELESS!
3. PageRank. The IB(P) metric treats all links equally. Thus, a link from the Yahoo home page counts the same as a link from some individual's home page. However, since the Yahoo home page is more important (it has a much higher IB count), it would make sense to value that link more highly. The PageRank backlink metric, IR(P), recursively defines the importance of a page to be the weighted sum of the backlinks to it. Such a metric has been found to be very useful in ranking results of user queries [Page 1998.2]. We use IR'(P) for the estimated value of IR(P) when we have only a subset of pages available.
More formally, if a page has no outgoing link, we assume that it has outgoing links to every single Web page. Next, consider a page P that is pointed at by pages T1, ..., Tn. Let ci be the number of links going out of page Ti. Also, let d be a damping factor (whose intuition is given below). Then, the weighted backlink count of page P is given by
IR(P) = (1-d) + d ( IR(T1)/c1 + ... + IR(Tn)/cn)
This leads to one equation per Web page, with an equal number of unknowns. The equations can be solved for the IR values. They can be solved iteratively, starting with all IR values equal to 1. At each step, the new IR(P) value is computed from the old IR(Ti) values (using the equation above), until the values converge. This calculation corresponds to computing the principal eigenvector of the link matrices. PageRank is described in much greater detail in [Page 1998.2].
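The iterative solution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function name pagerank, the dictionary-of-lists graph format, the tolerance, and the value d = 0.85 are all choices made for this example (the text does not fix a value for d), and each link list is assumed to contain no duplicates.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iteratively solve IR(P) = (1-d) + d * sum(IR(Ti)/ci).

    links maps each page to the list of pages it links to.
    d, tol and max_iter are illustrative defaults, not values from the text.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})

    # A page with no outgoing links is treated as linking to every page.
    out = {p: (links.get(p) or pages) for p in pages}

    # Start with all IR values equal to 1.
    ir = {p: 1.0 for p in pages}

    for _ in range(max_iter):
        # Every page starts the step with the (1 - d) term.
        new = {p: 1.0 - d for p in pages}
        # Each page Ti passes d * IR(Ti) / ci to every page it links to.
        for t, targets in out.items():
            share = d * ir[t] / len(targets)
            for p in targets:
                new[p] += share
        # Stop once no value moves by more than tol.
        delta = max(abs(new[p] - ir[p]) for p in pages)
        ir = new
        if delta < tol:
            break
    return ir


# Hypothetical four-page Web: D has no outgoing links.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": []}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

With this form of the equation and a starting value of 1 for every page, the IR values always add up to the number of pages, which is a handy sanity check on any implementation.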
One intuitive model for PageRank is that we can think of a user "surfing" the Web, starting from any page, and randomly selecting from that page a link to follow. When the user reaches a page with no outlinks, he jumps to a random page. Also, when the user is on a page, there is some probability, 1-d, that the next visited page will be completely random (so that, consistent with the equation above, a link is followed with probability d). This damping factor d makes sense because users will only continue clicking on one task for a finite amount of time before they go on to something unrelated. The IR(P) values we computed above, divided by the total number of pages, give us the probability that our random surfer is at P at any given time.
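The random-surfer picture can be checked with a small simulation. The sketch below is an assumption-laden illustration, not something from the paper: the function name simulate_surfer, the graph format, the step count, the seed, and d = 0.85 are all made up for the example. It follows the convention of the IR(P) equation, in which the surfer follows a link with probability d and jumps to a random page otherwise.

```python
import random


def simulate_surfer(links, d=0.85, steps=200_000, seed=0):
    """Estimate how often a random surfer is found on each page.

    links maps each page to the list of pages it links to.
    d, steps and seed are illustrative defaults, not values from the text.
    """
    rng = random.Random(seed)
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    visits = {p: 0 for p in pages}

    current = rng.choice(pages)  # start from any page
    for _ in range(steps):
        visits[current] += 1
        targets = links.get(current, [])
        if not targets or rng.random() >= d:
            # No outlinks, or a random jump (probability 1 - d):
            # the next page is picked uniformly from the whole Web.
            current = rng.choice(pages)
        else:
            # Otherwise follow one of the page's links at random.
            current = rng.choice(targets)

    # Fraction of time spent on each page.
    return {p: visits[p] / steps for p in pages}


# Hypothetical four-page Web: D has no outgoing links.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": []}
freq = simulate_surfer(example_links)
for page in sorted(freq, key=freq.get, reverse=True):
    print(page, round(freq[page], 3))
```

For a long enough run, each visit frequency should approach IR(P) divided by the number of pages, since the IR values from the equation sum to the page count rather than to 1.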
from:
http://web.archive.org/web/200008181...crawler-paper/