How does Google find web pages that match a search query?


If your computer, phone, or laptop is connected to the Internet, you have access to an ocean of knowledge, and search engines such as Google and Yahoo are the tools for finding exact information in that ocean. When you search on Google, you are almost instantly presented with a list of results from all over the web. So how does Google find the web pages that match your query, and how does it determine the order of the search results?

Google follows three key processes to deliver search results.


1. Crawling

Crawling is the process by which Googlebot discovers (or "crawls") new and updated pages among the billions of pages on the web. The program that does the fetching is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. The crawl process begins with a list of web page URLs, generated from previous crawls and augmented with Sitemap data provided by webmasters. As Googlebot visits each of these websites, it detects the links on each page and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index. Google does not accept payment to crawl a site more frequently, and it keeps the search side of its business separate from its revenue-generating AdWords service.
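The crawl loop described above can be sketched in a few lines of Python. This is only a toy illustration, not Googlebot's actual code: the TOY_WEB dictionary, the example URLs, and the crawl function are all made up for this post, and a real crawler would fetch pages over HTTP and decide how often to revisit each site.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
# A URL that is missing from the dictionary stands in for a dead link.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["b.example/new", "b.example/gone"],
    "b.example/new": [],
}


def crawl(seed_urls, web, max_pages=100):
    """Breadth-first crawl. Returns (pages visited in order, dead links found)."""
    seeds = list(dict.fromkeys(seed_urls))  # drop duplicate seeds, keep order
    frontier = deque(seeds)                 # the list of pages still to crawl
    seen = set(seeds)
    visited, dead = [], []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        links = web.get(url)
        if links is None:
            dead.append(url)                # note the dead link
            continue
        visited.append(url)
        for link in links:
            if link not in seen:            # newly discovered page
                seen.add(link)
                frontier.append(link)
    return visited, dead


# Seed list = URLs from a previous crawl plus URLs from a Sitemap.
previous_crawl = ["a.example/"]
sitemap_urls = ["b.example/"]
visited, dead = crawl(previous_crawl + sitemap_urls, TOY_WEB)
print(visited)
print(dead)
```

The seen set is what keeps the crawler from going around in circles when pages link back to each other.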


2. Indexing

Googlebot processes each page it crawls in order to compile a massive index of all the words it sees and their location on each page. In addition, Google processes information included in key content tags and attributes. Googlebot can process many, but not all, content types; for example, it cannot process the content of some rich media files or dynamic pages.
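An index of "all the words and their location on each page" is usually called an inverted index. Here is a minimal sketch of the idea in Python. The build_index function and the two sample pages are invented for illustration, and the simple word-splitting rule is an assumption; Google's real index is far more elaborate.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to {url: [positions of that word on the page]}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(url, []).append(position)
    return index


# Hypothetical crawled pages (URL -> page text).
pages = {
    "a.example/": "Fresh coffee beans and coffee grinders",
    "b.example/": "Tea and coffee",
}
index = build_index(pages)
print(index["coffee"])
```

Looking up a word is then a single dictionary access, which is why searching an index is so much faster than rereading every page.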


3. Serving Results

When a user enters a query, Google's machines search the index for matching pages and return the results Google believes are most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank of a given page. PageRank is a measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site's PageRank. Not all links are equal: Google works hard to improve the user experience by identifying spam links and other practices that negatively impact search results. The best types of links are those that are given based on the quality of your content.
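To make the idea of "importance from incoming links" concrete, here is a simplified PageRank sketch in Python. It follows the textbook iterative formulation with a damping factor of 0.85; that value, the pagerank function, and the tiny link graph are assumptions for illustration and not Google's production ranking, which combines many other factors.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}   # start with equal rank
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "home", nothing links to "orphan".
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page that nobody links to ends up with the lowest.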

For your site to rank well in search results pages, it is important to make sure that Google can crawl and index your site correctly. Following Google's Webmaster Guidelines can help improve your site's ranking.

Google's Related Searches, spelling suggestions, and Google Suggest features are designed to help users save time by displaying related terms, common misspellings, and popular queries. The keywords used by these features are automatically generated by Google's web crawlers and search algorithms. If a site ranks well for a keyword, it is because Google has algorithmically determined that its content is more relevant to the user's query.

To see the indexed pages of your site, use the site: operator, like this: site:google.com (note: do not put a space between the operator and the URL). You can perform the search on a whole domain or limit it to a certain subdomain or subdirectory, for example site:google.com/webmaster. To exclude pages from your search, use a minus sign before the operator. For example, the search site:google.com -site:adwords.google.com gives you all the indexed pages on the google.com domain without the pages from adwords.google.com.

The cache: operator shows you an archived copy of a page indexed by Google. For example, cache:google.com displays the last indexed version of the Google homepage, along with information about the date the cache was created. You can also view a plain-text version of the page, which is useful because it shows how Googlebot sees the page. If you do not want searchers to be able to access a cached version of your page, use the noarchive meta tag, like this:

<meta name="robots" content="noarchive">

The page will still be crawled and indexed by Google, but users will not see a cache link in the search results.
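If you want to check whether one of your own pages already carries the noarchive directive, a small script can do it. The sketch below uses Python's standard html.parser module; the RobotsMetaParser class and the has_noarchive function are names made up for this post, and the check assumes the directive sits in a robots meta tag in the page's HTML.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noarchive(html):
    """Return True if the HTML contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


page = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE"></head><body>Hi</body></html>'
print(has_noarchive(page))
```

The comparison is case-insensitive, so it does not matter whether the tag is written in upper or lower case.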
