Quote:
Originally Posted by dazzling
I want to keep it as simple as possible with the minimum load on a server.
Just to save you time: you will need more than one server. Even if you use a homogeneous search engine/crawler, you won't get away with even a 100 million page index on a single server, unless you plan on crawling only about 10,000 pages a day. Think of the processing needed for things like clone detection, caching, etc.
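To put rough numbers on that, here is a quick back-of-the-envelope sketch. The 100 million page index and the 10,000 pages a day come from the figures above; the 30-day refresh window and the function names are my own assumptions, just for illustration.

```python
SECONDS_PER_DAY = 86_400

def days_to_crawl(index_pages: int, pages_per_day: int) -> float:
    """Days needed to fetch every page once at a fixed daily crawl rate."""
    return index_pages / pages_per_day

def pages_per_second(index_pages: int, refresh_days: int) -> float:
    """Sustained fetch rate needed to re-crawl the whole index every refresh_days."""
    return index_pages / (refresh_days * SECONDS_PER_DAY)

index_pages = 100_000_000   # index size mentioned above
slow_rate = 10_000          # pages per day mentioned above
refresh_days = 30           # assumed refresh window, not from the thread

days = days_to_crawl(index_pages, slow_rate)
print(f"One full pass at {slow_rate:,} pages/day: {days:,.0f} days (~{days / 365:.1f} years)")
print(f"Refreshing every {refresh_days} days needs ~{pages_per_second(index_pages, refresh_days):.1f} pages/sec")
```

So at 10,000 pages a day a single pass over the index takes about 27 years, and keeping it fresh monthly means fetching roughly 39 pages every second, around the clock, before you count any parsing, clone detection or caching work.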