I've been observing the similarity filters and duplicate-content thresholds on G and other SEs over several generations of algos.
G's algos for flagging pages as similar are quite sophisticated. For example, pages stitched together from small snippets of other pages, no matter how randomly the snippets are chosen, at some point start being seen as similar.
This also applies to links pages that use the same titles and descriptions as everyone else. Once those titles and descriptions are all over the web they get 'burned' and are no longer considered unique.
The surest way to avoid tripping dup/similar filters is to write each page's text as a new piece of work. If you have multiple pages linking to a common page and your links include descriptions, write an original description for each link on every page. Even so, once you start cranking out hundreds or thousands of pages, you will statistically start repeating yourself. You may not notice it from within your own site, but an algo comparing large numbers of pages can still trip a filter.
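To make the "random snippets still look similar" point concrete, here is a minimal sketch of the kind of near-duplicate check search engines are widely believed to use: word shingles compared with Jaccard similarity. This is an illustration of the general technique, not G's actual (proprietary) filter, and the sample page texts are invented.

```python
# Near-duplicate detection sketch: word shingles + Jaccard similarity.
# NOT any search engine's real algorithm -- just the common textbook
# technique, showing why reshuffled snippets still register as "similar".

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# page_b is page_a with its phrases reshuffled; page_c is unrelated text.
page_a = "our widgets are the best widgets on the market today"
page_b = "on the market today our widgets are the best widgets"
page_c = "handmade oak furniture crafted in small family workshops"

sim_ab = jaccard(shingles(page_a), shingles(page_b))  # high: shared shingles
sim_ac = jaccard(shingles(page_a), shingles(page_c))  # zero: no overlap
```

Even though page_b reorders page_a, most of its three-word shingles survive the shuffle, so the pair scores far above the unrelated page. That is why snippet-assembled pages eventually get flagged no matter how you scramble them.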
In summary, a web page is considered (by SEs) as:
a) content page - when most of the text content on the page is original (not repeated anywhere else on your site or others) and with minimal off-site links.
b) hub page - when there is a high ratio of off-site links to actual original text content. Link lists are examples of hub pages.
c) both - when there is a mix of off-site links and original content.
From my experience, it is best to create pages that are clearly one or the other, content or hub. Each type is more effective when it does its one job well.
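The content/hub distinction above boils down to the ratio of off-site links to original text. A hypothetical classifier might look like the sketch below; the thresholds are made-up assumptions for illustration, not known SE parameters.

```python
# Hypothetical content-vs-hub heuristic based on the ratio of off-site
# links to text volume. Threshold values are illustrative assumptions only.

def classify_page(word_count, offsite_links):
    """Classify a page as 'content', 'hub', or 'mixed' by link density."""
    links_per_100_words = offsite_links / max(word_count, 1) * 100
    if links_per_100_words < 1:
        return "content"   # mostly original text, minimal off-site links
    if links_per_100_words > 5:
        return "hub"       # link list: high off-site-link-to-text ratio
    return "mixed"         # a blend of links and original content
```

For example, an 800-word article with two outbound links classifies as content, while a 300-word page carrying forty outbound links classifies as a hub.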
In any case, when linking to internal and offsite pages, it is important that you link to pages which relate to the theme of the page you are linking from.
-Dino