December 09, 2008

Crawling vs Indexing

We run a lot of searches every month. We type a term into the Google search box, press the button, and are shown a huge number of results ranked by how relevant they are to that term.

Have we ever thought about how this happens? Just by entering a search term, we are shown highly accurate information. In this post I try to explain the concept behind Google search results by comparing the crawling and indexing processes of the Google search engine bot.

There are basically three processes involved in presenting search results to visitors:

  • Crawling the Website
  • Indexing the Website
  • Serving the Results

Website Crawling

In simple terms, crawling is the process of finding new and updated pages to be added to the Google index. Crawling is done by Google's software, called Googlebot (also known as a search engine spider, robot, or bot). Google's crawling algorithm determines which sites to crawl, how often, and how many pages to fetch from each site.

The crawl process starts with a list of URLs generated from previous crawls and from Sitemaps submitted by webmasters. Googlebot visits each of these pages, detects the links on each page, and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index.
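The loop described above can be sketched in a few lines of Python. This is only a toy illustration, not how Googlebot is actually implemented: the TOY_WEB dictionary, the crawl function and the example.com URLs are all made up for this post, and no real pages are fetched.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> links found on that page.
# A real crawler would fetch each page over HTTP and parse its HTML.
TOY_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/missing"],
    "http://example.com/b": ["http://example.com/"],
}


def crawl(seed_urls, web, max_pages=100):
    """Breadth-first crawl. Returns (pages visited in order, dead links)."""
    frontier = deque(seed_urls)   # list of pages still to crawl
    seen = set(seed_urls)         # URLs already queued, to avoid repeats
    visited, dead = [], []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        links = web.get(url)
        if links is None:
            # The page does not exist: note it as a dead link.
            dead.append(url)
            continue
        visited.append(url)
        for link in links:
            # Add newly discovered links to the list of pages to crawl.
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited, dead


visited, dead = crawl(["http://example.com/"], TOY_WEB)
print(visited)
print(dead)
```

The seed list plays the role of the URLs from previous crawls and Sitemaps, and the max_pages limit is a crude stand-in for deciding how many pages to fetch.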

Crawling is an automated process, and Google does not accept payment to crawl a website more frequently.

Website Indexing

In simple terms, indexing means maintaining a list of things. In this context, indexing means compiling an index of the words on each page and their locations, as seen when Googlebot crawled the site. In addition, Googlebot processes the information in key content tags and attributes, such as title tags and ALT attributes. Googlebot can process many content types, but not all.
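An index of words and their locations is often called an inverted index. Here is a minimal Python sketch of the idea, assuming each page has already been reduced to plain text. The PAGES data and the tokenize and build_index names are invented for illustration, and Google's real index is of course far more elaborate.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: [positions of the word on that page]}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(url, []).append(position)
    return index


# Hypothetical crawled pages: URL -> page text.
PAGES = {
    "http://example.com/a": "Crawling finds new pages",
    "http://example.com/b": "Indexing stores words from pages",
}

index = build_index(PAGES)
print(index["pages"])
```

Looking up a word in this structure gives every page that contains it, along with where on the page it appears.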

The Search Results

Serving results is the process in which Google returns results for the query entered by the user, based on matches with the indexed pages. The relevance of each search result depends on many factors, and Google orders the results based on those factors.
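To make the matching step concrete, here is a small Python sketch that looks up each query word in a toy index and orders the pages by a single made-up factor: how many times the query words appear on the page. The TOY_INDEX data and the search function are hypothetical, and this scoring is only a placeholder for the many factors Google actually uses.

```python
def search(query, index):
    """Return matching URLs, best score first (ties broken by URL)."""
    scores = {}
    for word in query.lower().split():
        # Every page that contains the word gets credit for each occurrence.
        for url, positions in index.get(word, {}).items():
            scores[url] = scores.get(url, 0) + len(positions)
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical index: word -> {url: [positions of the word on that page]}.
TOY_INDEX = {
    "crawling": {"http://example.com/a": [0]},
    "indexing": {"http://example.com/b": [0]},
    "pages": {"http://example.com/a": [3], "http://example.com/b": [4, 7]},
}

results = search("indexing pages", TOY_INDEX)
print(results)
```

A page that was never crawled and indexed cannot appear in these results at all, however relevant it might be.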

In order for your site to rank well in search results pages, it's important to make sure that Google can crawl and index your site correctly.

