About The Author

I am a search marketing geek. I work as an APM for one of the leading companies in the UK. I am interested in socializing and helping others.

July 06, 2008


Generally speaking, sitemaps fall into two categories:

1) An HTML Sitemap, which shows users the important pages of your website, and

2) An XML Sitemap, which gives search engines information about your website.

Sitemaps are an integral part of letting the search engine spiders know what your website is about and what it contains. Spiders can crawl the website through the URLs listed in the sitemap, which helps them index it better. A sitemap can be thought of as a "structured page" that gathers the important pages of the website in one place and passes that information on to the search engine spiders.

More specifically, a Sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL (when it was last modified, how often it changes, and its priority, i.e. how important it is relative to other URLs on the website) to help search engines crawl the website more intelligently.
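To make this concrete, here is a minimal sketch in Python that reads a small sitemap and pulls out the metadata for each URL. The host www.example.com, the dates and the values are made-up placeholders, and read_sitemap is just a helper name chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-page sitemap for the placeholder host www.example.com.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-07-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/about.html</loc>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; missing optional tags become None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in read_sitemap(SAMPLE):
        print(entry)
```

Only the loc value is present for every URL; the other three fields are optional, which is why the second entry comes back with None for them.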

Web crawlers normally discover pages by following links, either from within the website or from other websites. Sitemaps do not guarantee that web pages are included in search engines, but they let the crawlers go through the listed URLs along with their metadata, which gives them a better picture of your website.

Sitemaps are useful if:

~ the site consists of dynamic pages,
~ the pages of the site cannot be easily discovered by Googlebot,
~ the website is new,
~ the site has a large archive of content pages that are not well linked to each other, or are not linked at all.

The Sitemap protocol format consists of XML tags. All data values in a Sitemap must be entity-escaped, and the file itself must be UTF-8 encoded. The basic structure of a Sitemap is as follows.

The Sitemap must:

-> Begin with an opening <urlset> tag and end with a closing </urlset> tag.
-> Specify the namespace (protocol standard) within the <urlset> tag.
-> Include a <url> entry for each URL, as a parent XML tag.
-> Include a <loc> child entry for each <url> parent tag.

Along with these, there are optional tags:

-> <lastmod> - the date the page was last modified
-> <changefreq> - how frequently the page is likely to change
-> <priority> - tells the crawlers how important the page is relative to the other pages on the website. Valid values range from 0.0 to 1.0, and the default priority is 0.5.
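The required and optional tags above can be put together as in the following sketch, which builds a sitemap with Python's standard xml.etree.ElementTree module (Python 3.8 or later for the xml_declaration argument). The function name build_sitemap, the dictionary keys and the example.com URLs are assumptions made for this illustration, not part of the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace that the Sitemap protocol requires on the <urlset> tag.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML as UTF-8 bytes from a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional and are only written when present.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # ElementTree escapes &, < and > in text values while serialising.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages on a placeholder host.
    pages = [
        {"loc": "http://www.example.com/", "changefreq": "daily", "priority": 1.0},
        {"loc": "http://www.example.com/view?id=1&cat=2", "lastmod": "2008-07-01"},
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

Letting the XML library write the file takes care of the UTF-8 encoding and of escaping characters such as & in URLs, which is easy to get wrong when the file is assembled by hand.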

Also, all the URLs in a Sitemap must be from a single host.
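A quick way to check the single-host rule before submitting a sitemap is sketched below. The helper name urls_off_host and the example.com URLs are hypothetical, and the check only compares host names, so treat it as a rough sanity test rather than a full validator.

```python
from urllib.parse import urlsplit


def urls_off_host(urls, sitemap_url):
    """Return the URLs whose host differs from the host of the sitemap itself."""
    expected = urlsplit(sitemap_url).hostname
    return [u for u in urls if urlsplit(u).hostname != expected]


if __name__ == "__main__":
    # Placeholder URLs; note that example.com and www.example.com are different hosts.
    urls = [
        "http://www.example.com/a.html",
        "http://example.com/b.html",
    ]
    print(urls_off_host(urls, "http://www.example.com/sitemap.xml"))
```

Any URL returned by the helper would need to move to a sitemap on its own host.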

There are also a few specialised sitemap formats:

~ Video Sitemaps
~ Mobile Sitemaps
~ News Sitemaps
~ Code Search Sitemaps
~ Geo Sitemaps

Google Sitemap Generator

I found this sitemap generator tool to be very useful: XML Sitemap Generator