Online Marketing Services from Intelligent Marketing...

    

Guide to Sitemaps

Search Bots | Types | Layout and Formats | Creating a Sitemap

Introduction - sitemaps provide a simple way for webmasters to tell search engine crawlers, or 'searchbots', the addresses (URLs) of all the pages they would like to submit (or re-submit) to the search engine. A sitemap is basically a list of URLs and is one of the first pages search engines will automatically visit to speed up the 'caching' or copying process.

 

The Role of Search Bots - in the past, search engines relied on the bot following all available links from every page to every other page. Some of these pages contained noindex or nofollow instructions, so this was recognised as an inefficient crawling methodology. Sitemaps were created to provide a single, central list of the pages to crawl. As new pages are added by webmasters, it is vital that the sitemap files on a website are also updated to reflect the additional content to cache. Search bots may visit several times a day, depending on how frequently a website is updated. The more often it is updated, the more likely a search bot is to return to the sitemaps to find the locations of the new files to cache. The faster pages are cached by a search engine, the sooner visitors can find a new page from a search and hence visit the site. Sitemaps are also useful for humans trying to locate a specific piece of information or a page about a topic. This is particularly helpful on a very large site with hundreds or thousands of pages to trawl through.

 

Types of Sitemap Files - historically there are 4 basic sitemap files that search bots will cache, following the links contained in them: the static human-readable one, an XML-based sitemap, urllist.txt (which is historically Yahoo specific) and robots.txt (which was originally designed to provide additional exclusion instructions to the bots). The main industry standard is now XML because of the flexibility, speed and convenience of using this data format to collect URL references. This XML format (the Sitemap 0.9 protocol) has been agreed by Google, MSN and Yahoo (the major search engines). The XML sitemap includes not just the list of URLs on a website but also when each page was last modified, how often it is updated and changed, and how important it is relative to the other pages on the site. Together these hint at when a search engine should return to re-cache the page. Of course, the search bots are owned by the search engines, and the algorithmic criteria they use may mean a bot ignores the instructions in a sitemap and does not return until it is ready to. Our XML sitemap for this site is here.
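
As a rough illustration of the robots.txt role described above, the sketch below uses Python's standard urllib.robotparser module to read exclusion rules and any Sitemap line from a robots.txt file. The file content, the /private/ path and the sitemap location are made up for this example and are not taken from any real site; the site_maps() call assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Exclusion instructions: may a bot fetch these pages?
allowed_home = parser.can_fetch("*", "http://www.example.com/")
allowed_private = parser.can_fetch("*", "http://www.example.com/private/page.html")

# Sitemap locations listed in the file (Python 3.8+).
sitemaps = parser.site_maps()

print(allowed_home, allowed_private, sitemaps)
```

A real search bot applies its own rules on top of this, so the result only shows what the file asks for, not what a given search engine will actually do.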

 

Format of the XML Sitemap - the following example shows the proper layout of an XML sitemap. It contains only 1 URL and uses the optional tags <lastmod>, <changefreq> and <priority> alongside the required <loc> tag:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
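
To show how a program might read this format, here is a minimal sketch that parses the example above with Python's standard xml.etree.ElementTree module. The function name read_sitemap and the dictionary layout are our own choices for illustration, not part of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# The example sitemap from this guide, held as bytes for parsing.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
""".encode("utf-8")

# Every tag lives in the Sitemap 0.9 namespace, so lookups need this prefix map.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_bytes):
    """Return one dict per <url> entry; optional tags that are absent become None."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


entries = read_sitemap(SITEMAP_XML)
print(entries)
```

All values come back as text, so a caller that wants to sort by priority or compare dates would need to convert them first.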

Creating an XML Sitemap - there are many free and paid tools for automatically creating a sitemap in minutes, so there is no point manually creating one each time (especially when you need to regularly update and crawl your own site to make sure it's up to date). We have used xml-sitemaps.com to create the sitemap for this site. It's a simple-to-set-up script that generates any kind of sitemap you require: XML, text or HTML. It is developed in PHP and works with most web server configurations.
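
For readers curious about what such a tool does at its simplest, the sketch below builds a sitemap from a hand-written list of pages using Python's standard library. It is not how xml-sitemaps.com works (that script crawls the site for you); the build_sitemap function, the PAGES list and the about.html address are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical page list; a real tool would discover these by crawling the site.
PAGES = [
    {"loc": "http://www.example.com/", "lastmod": "2005-01-01",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.example.com/about.html"},  # optional tags omitted
]


def build_sitemap(pages):
    """Return the sitemap XML as a string, with one <url> element per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Only write the optional tags that the page actually supplies.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_text = build_sitemap(PAGES)
print(sitemap_text)
```

The resulting text would typically be saved as a UTF-8 file such as sitemap.xml in the site's root folder. ElementTree escapes characters like & in URLs automatically, which is one reason to generate the file rather than type it by hand.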

   
