Review of Google
http://www.searchengineshowdown.com/...le/index.shtml
Google has become for many the pre-eminent Web search engine. In Feb. 1999 it moved from Alpha test version to Beta and on Sept. 21, 1999 it officially launched.
Since that time it has made its mark with its relevance ranking based on link analysis, cached pages, and aggressive growth. Since its beta release, it has had phrase searching and the - for NOT, but it did not add an OR operation until Oct. 2000. In Dec. 2000, it added title searching. In June 2000 it announced a database of over 560 million pages, which grew to over 600 million by the end of 2000 and then 1.5 billion in Dec. 2001. The 2+ billion reported on their home page as of April 2002 includes indexed pages, unindexed URLs, and other file formats. By Nov. 2002, they moved their claim up to 3 billion.
Databases: Google offers its own database of indexed Web pages along with another collection of URLs that it has not indexed, such as duplicate URLs, redirected URLs, pages protected by a robots.txt file, and pages with access restrictions. Results from this latter database are rarely seen but can be identified by the lack of an extract, size, and cached copy information in the results. Its regular index was expanded in 2001 to include additional file types: first PDF files, and then .ps, .doc, .xls, .txt, .ppt, .rtf, .asp, .wpd, and more. See Google Database Components for more details. Google has an image database and a Usenet news database known as Google Groups. Their News Search, Catalog Search, and Froogle (a shopping search) were all still in beta as of Aug. 2003.
Google also has a PageRank version of the Open Directory, and above their regular results, hits from their own news headlines database, stock quotes, and a phone number database may display. In addition it offers several specialized subsets: a government database of the .gov and .mil sites; University searches; a Linux search; an Apple/Macintosh search; and a Microsoft search.
The Google database is used by AOL, iWon, Netscape's Search site, Yahoo! (as its back-end search engine), and the Weather Underground, among many others. Yahoo! switched from Inktomi to Google in July 2000 and reaffirmed and more closely integrated Google results in Oct. 2002. BBCi used Google from May 2002 until March 2003, when they switched to Inktomi.
Strengths:
* Size and scope: It is now the largest, and includes PDF, DOC, PS, and many other file types
* Relevance based on sites' linkages and authority
* Cached archive of Web pages as they looked when indexed
* Additional databases: Google Groups, News, Directory, etc.
Weaknesses: See also the Google Inconsistencies Page
* Limited search features: no nesting, no truncation, does not support full Boolean
* Link searches must be exact and are incomplete
* Only indexes first 101 KB of a Web page and about 120 KB of PDFs
* Site clustering is difficult to turn off
Default Operation:
Multiple search terms are processed as an AND operation by default. Phrase matches are ranked higher.
Boolean Searching:
Google uses an automatic Boolean AND between terms and has slowly been moving towards more Boolean support; however, it does not yet support the AND operator, NOT operator, or full Boolean searching with the ability to nest operators. In Feb. 1999, Google added the - symbol to perform a NOT function. In Oct. 2000, they added the ability to use an OR (which must be in upper case) to do some Boolean OR operations. See the Boolean Searching on Google page for more details on how to get Google to do certain kinds of Boolean searches.
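The operators described above can be put together mechanically. The following is a minimal sketch, not an official Google tool: the function names build_query and quote_if_phrase and their parameters are hypothetical, invented for illustration. It only assembles the query string; how Google groups an OR relative to the neighboring terms is not covered here (see the Boolean Searching on Google page for that).

```python
def quote_if_phrase(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(required, any_of=(), excluded=()):
    """Assemble a query string from the syntax described in this review.

    required -- terms that must all appear (implicit AND: just a space)
    any_of   -- alternatives joined with OR, which must be upper case
    excluded -- terms prefixed with - to act as NOT
    All names here are hypothetical; this is only string assembly.
    """
    parts = [quote_if_phrase(t) for t in required]
    if any_of:
        parts.append(" OR ".join(quote_if_phrase(t) for t in any_of))
    # The - must touch the term it excludes, with no space in between
    parts.extend("-" + quote_if_phrase(t) for t in excluded)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(["search", "engine"],
                      any_of=["review", "comparison"],
                      excluded=["yahoo"]))
```

Note that no AND or + is emitted for the required terms, since the default operation already requires every term.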
The + could once be used to require a term, but since the default operation was AND, the + was never really needed, and for a while it caused the following message to appear:
Google always searches for pages containing all the words in your query, so you do not need to use + in front of words.
However, the + can be used for forcing a search on stop words.
Proximity Searching:
In Feb. 1999, Google added phrase searching designated in the usual manner by enclosing the phrase in "double quotes." Google also detects phrase matches even when the quotes are not used and usually ranks phrase matches higher. No other proximity searching is directly available. However, using the wildcard word within a phrase trick described below, the unofficial Google API Proximity Search tool can reproduce proximity searching up to a distance of 3 words.
Truncation:
No truncation is available nor is there any automatic plural searching, word stemming, or other related features. However, within phrases, there is a trick which can be used for a wildcard word. Use an asterisk * within a phrase search to match any word in that position. So, for example, to find "a little neglect may breed mischief" when you are not sure of the second to last word, search "a little neglect may * mischief". Multiple asterisks can be used as in "a little * * * mischief". This is the only way Google supports a wildcard symbol.
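To illustrate how the wildcard word can stand in for proximity searching, here is a rough sketch that generates the phrase queries one would need to run. It is not the code of the unofficial Proximity Search tool; the function name proximity_queries and its parameters are hypothetical, and it assumes that "distance" means the number of wildcard words placed between the two terms.

```python
def proximity_queries(first, second, max_gap=3, both_orders=True):
    """Return phrase queries matching `first` and `second` with 0..max_gap
    unknown words between them, using * as the wildcard word.

    Assumption: 'distance' is counted as the number of intervening words.
    """
    queries = []
    for gap in range(max_gap + 1):
        middle = ["*"] * gap  # one asterisk per unknown word position
        queries.append('"' + " ".join([first, *middle, second]) + '"')
        if both_orders:
            # Phrases are ordered, so the reverse order needs its own query
            queries.append('"' + " ".join([second, *middle, first]) + '"')
    return queries


if __name__ == "__main__":
    for q in proximity_queries("neglect", "mischief", both_orders=False):
        print(q)
```

Each generated phrase would be run as a separate search and the results merged, since there is no way to combine them into a single nested query.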
While not exactly truncation, the synonym operator, a tilde ~ placed directly before a search term with no space, tells Google to look for synonyms. So a search on yosemite ~trails will find pages that have terms like 'hiking,' 'rides,' and 'maps.' This synonym finder will sometimes include plural, singular, or other grammatical variants as well: the same search also found matches with 'trail' and 'trailer.' So the ~ can be used to get something a bit closer to truncation, though not very close. Bear in mind that the ~ only works in Google's Web database and only for English language terms.
Case Sensitivity:
Google has no case sensitive searching. Using either lower or upper case results in the same hits.
Field Searching:
Google offers several field searches connected with entering URLs. In the December 2000 revision of its advanced search form, it added several title and URL field searches.
Note that most field searching cannot be combined with other query words. In other words, a search entered as uniqueword link:name.com will be processed as if only the field search were present, as in link:name.com. The uniqueword is ignored. One exception is the site: limit, which cannot be used alone and must have another search term. The more recently added intitle: and inurl: fields can be combined with other search terms.
Field Explanation
intitle: Finds pages that have the term(s) in the HTML title element. Can be combined with other search terms. Example: intitle:search engines should find 'search' in the title and 'engines' anywhere in the page.
inurl: Finds pages that have the term(s) somewhere in the URL (host name, path, or filename). Can be combined with other search terms. Example: inurl:searchenginewatch
allintitle: Finds pages that have the term(s) in the HTML title element. Example: allintitle:search engines
link: Finds pages which contain hypertext links to the exact specified URL. Example: link:notess.com/search finds pages with links to this site.
allinurl: Finds pages that have the term(s) somewhere in the URL (host name, path, or filename). Example: allinurl:searchenginewatch
site: Finds pages from the designated Web site. Paths and file names cannot be included. An additional search term must be used. Try a term from the domain name for the most comprehensive results. Example: notess site:notess.com shows how many pages Google has indexed or listed. As of April 2000, this was an undocumented search feature.
allinanchor: Finds pages that have the term(s) somewhere in the links to the page.
related: Invokes GoogleScout to find other pages similar in linkage patterns to the given URL and at a similar hierarchical level. The URL must be exact. In other words, related:notess.com and related:www.notess.com find different results.
flink: Used to find pages linked from the given URL. No longer working as of Oct. 30, 1999. Example: flink:notess.com
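The combination rules for field searches can be expressed as a small checker. This is only a sketch of the behavior described in this review, not of anything Google publishes: the function name field_warnings and the three rule sets are hypothetical. The review says "most" fields ignore other words without listing them all, so only link: (the one given as an example) is placed in that set.

```python
# Rule sets taken from the description in this review (names are hypothetical)
COMBINABLE = {"intitle", "inurl"}      # may be mixed with other terms
NEEDS_OTHER_TERM = {"site"}            # cannot be used alone
IGNORES_OTHER_TERMS = {"link"}         # other words are silently dropped
KNOWN = COMBINABLE | NEEDS_OTHER_TERM | IGNORES_OTHER_TERMS


def field_of(token):
    """Return the field prefix of a token such as 'site:notess.com', else None."""
    prefix, sep, _ = token.partition(":")
    prefix = prefix.lower()
    return prefix if sep and prefix in KNOWN else None


def field_warnings(query):
    """List the problems a query would run into under the rules above."""
    tokens = query.split()
    fields = [f for f in map(field_of, tokens) if f]
    plain = [t for t in tokens if field_of(t) is None]
    warnings = []
    for field in fields:
        if field in NEEDS_OTHER_TERM and not plain:
            warnings.append(f"{field}: needs an additional search term")
        elif field in IGNORES_OTHER_TERMS and plain:
            warnings.append(f"{field}: other words are ignored: {' '.join(plain)}")
    return warnings


if __name__ == "__main__":
    for q in ("uniqueword link:name.com", "site:notess.com",
              "notess site:notess.com", "intitle:search engines"):
        print(q, "->", field_warnings(q) or "ok")
```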
Before the official release in Sept. 1999, clicking the small bar graph at the beginning of a displayed hit would automatically run a link: search, but that graphic disappeared with the official launch. Another field search which can be used is related:[URL] which invokes GoogleScout to find other pages similar in linkage patterns to the given URL.
Limits:
Google has language, domain, date, filetype, and adult content limits. The date limit, added in July 2001, is only available on the Advanced Search page. Only three options are available: Past 3 Months, Past 6 Months, or Past Year.
The file type limit, introduced when other file types were added to the Google index, reached the Advanced Search page in Nov. 2001. The Advanced Search page only offers file type limits, under the label File Formats, for PDF, Word (.doc), Excel (.xls), PowerPoint (.ppt), and Rich Text Format (.rtf). Using the filetype: prefix, the file type limit can also be used for PostScript (.ps), Text (.txt), .htm, WordPerfect (.wpd), and other file extensions. To use the prefix command, just put the extension immediately after filetype: as in differentials filetype:ps.
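As a small illustration of the prefix command, the sketch below appends a filetype: limit to a query and reports whether the same limit is also offered on the Advanced Search form. The helper names with_filetype and on_advanced_form are hypothetical; the list of form options is the one given in the paragraph above.

```python
# File formats listed on the Advanced Search form, per the text above
ADVANCED_FORM_TYPES = {"pdf", "doc", "xls", "ppt", "rtf"}


def normalize(extension):
    """Lower-case the extension and drop a leading dot ('.PS' -> 'ps')."""
    return extension.lower().lstrip(".")


def with_filetype(query, extension):
    """Append a filetype: limit; the extension follows the colon with no space."""
    return f"{query} filetype:{normalize(extension)}"


def on_advanced_form(extension):
    """True if the format can also be chosen from the Advanced Search form."""
    return normalize(extension) in ADVANCED_FORM_TYPES


if __name__ == "__main__":
    print(with_filetype("differentials", ".ps"))
    print(on_advanced_form("ps"))
```

For extensions that are not on the form, such as ps or wpd, the prefix is the only way to apply the limit.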