Lesson 2A: Show a search engine the direction with an XML sitemap
Finding something is much easier if you know where to go. Use XML sitemaps to direct search engines to the pages or data objects that are available for crawling.
XML sitemaps are an easy way to inform search engines about the pages or data objects you want to be crawled. An XML sitemap lists URLs and additional metadata about each URL.
Expected result: a short indexation period and periodic re-crawling, based on the change frequency indicated in the sitemap.
Possible approach
Create an XML sitemap with an entry (<url> » <loc>) for each URL you want to be crawled.
Add additional metadata to each URL:
- <lastmod>: denotes when the data was last updated (for spatial data, use the date it was collected or measured).
- <changefreq>: denotes how often the data is updated.
- <priority>: the priority of this URL relative to the other URLs in the sitemap.
Example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://geo4web.apiwise.nl/gemeente/GM0307</loc>
<lastmod>2016-01-01</lastmod>
<changefreq>yearly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
See also: sitemaps.org
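For larger sets of URLs, a sitemap like the example above is usually generated rather than written by hand. The following is a minimal sketch in Python, using only the standard library; the function name `build_sitemap` and the dictionary-based input format are assumptions made for illustration, not part of the testbed.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (bytes) for a list of entry dicts.

    Each entry needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional and are written only when present.
    (Hypothetical helper, shown for illustration.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Write the child elements in the same order as the example above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        {
            "loc": "https://geo4web.apiwise.nl/gemeente/GM0307",
            "lastmod": "2016-01-01",
            "changefreq": "yearly",
            "priority": 0.8,
        }
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes characters such as `&` in URLs automatically, which is one reason to prefer it over string concatenation.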
However, XML sitemaps can become very large for large (spatial) datasets.
A single sitemap file is limited in size (the sitemaps.org protocol allows at most 50,000 URLs per file). The sitemap specification does, however, support a form of pagination: the URLs can be split over several sitemap files, which are then listed in a sitemap index file. In a previous version of the specification, a sitemap could contain a spatial extent for a resource, but this functionality is no longer supported.
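The pagination approach can be sketched as follows: split the list of URLs into chunks, write one sitemap file per chunk, and publish a sitemap index that points at those files. This is a minimal sketch under stated assumptions; the helper names `chunk` and `build_sitemap_index` and the `example.org` file names are hypothetical, and the chunk size is a parameter (`max_urls`), set here to the 50,000 URLs per file documented on sitemaps.org.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, max_urls=50000):
    """Split a list of URLs into consecutive chunks of at most max_urls."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document (bytes) listing each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical dataset of 120,000 object URLs.
    all_urls = [f"https://example.org/object/{i}" for i in range(120000)]
    parts = chunk(all_urls)
    # One (hypothetical) sitemap file name per chunk.
    files = [f"https://example.org/sitemap-{n}.xml" for n in range(1, len(parts) + 1)]
    print(build_sitemap_index(files).decode("utf-8"))
```

Only the index file then needs to be registered with the search engine; the individual sitemap files are discovered through it.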
The second phase of the testbed suggests that, even though a sitemap is important, a human-readable "datamap" is at least as important, so that developers understand which data is available when they reach the site.
How to test
Register the XML sitemap with Google or Bing. From that point on, the crawling and indexation status can be monitored in the Google and Bing Webmaster Tools.
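Before registering a sitemap, it can help to run a quick local check that the file parses and that every entry carries an absolute URL. The sketch below is one possible way to do that with the Python standard library; `check_sitemap` is a hypothetical helper, and it does not replace the validation done by the search engines themselves.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def check_sitemap(xml_bytes):
    """Parse a sitemap and return a list of problems (empty if none found)."""
    problems = []
    root = ET.fromstring(xml_bytes)  # raises ParseError on malformed XML
    urls = root.findall("sm:url", NS)
    if not urls:
        problems.append("no <url> entries found")
    for i, url in enumerate(urls, start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        # Each entry should have a <loc> holding an absolute http(s) URL.
        if not loc.startswith(("http://", "https://")):
            problems.append(f"entry {i}: missing or non-absolute <loc>")
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://geo4web.apiwise.nl/gemeente/GM0307</loc></url>"
        b"</urlset>"
    )
    print(check_sitemap(sample))
```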
Some examples of XML sitemaps for spatial datasets can be found here: