In this post I will talk about sitemaps in Sitecore and guide you through
both creating a Sitecore XML sitemap for SEO and creating an HTML sitemap
to use on your website.
In this part I will clarify the following topics as an introduction to the
implementation that follows:
- Sitemap?
- Robots.txt file
- Sitecore and XML file security
- Sitemap.xml structure and tags clarification
- What a good Sitecore sitemap module should have
Sitemap?
To begin with, search engines such as Google and Bing use the XML sitemap
file to crawl your website more effectively, and better crawling and indexing
help users find your pages. A single XML sitemap can contain at most 50,000
URLs and is limited in size (10MB when this was written; the sitemaps.org
protocol now allows 50MB uncompressed). What if your XML sitemap exceeds that
limit? Then you need to split it into more than one XML file and create an
XML index file that ties these sub-files together.
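To make the splitting idea concrete, here is a minimal sketch in Python (not Sitecore code) that chunks a URL list by the 50,000 limit and builds the index file. The sitemap_N.xml naming, the example.com domain and the helper names are assumptions made for illustration only.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit mentioned above


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(base_url, file_count):
    """Return index XML pointing at sitemap_1.xml .. sitemap_N.xml (assumed names)."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, file_count + 1):
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap_{n}.xml"
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,000 pages: too many for a single sitemap file.
urls = [f"http://www.example.com/page{i}" for i in range(120_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index("http://www.example.com", len(chunks))
print(len(chunks), "sitemap files needed")
```

Each chunk would then be written to its own sitemap file, and only the index file needs to be announced to search engines.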
Robots.txt:
Google defines robots.txt as "a file at the root of your site that indicates
those parts of your site you don’t want accessed by search engine crawlers.
The file uses the Robots Exclusion Standard, which is a protocol with a small
set of commands that can be used to indicate access to your site by section
and by specific kinds of web crawlers (such as mobile crawlers vs desktop
crawlers)."
In our case we will use this file to tell search engines where to find the
sitemap.xml file.
This can be done using the following syntax:
Sitemap: {{ site.url }}/sitemap.xml
You can find more information about this file here.
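As a quick way to check that a robots.txt file really advertises the sitemap, the sketch below uses Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The file contents, the example.com domain and the Disallow rule are assumptions for illustration, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the Sitemap line follows the syntax above.
robots_lines = [
    "User-agent: *",
    "Disallow: /sitecore/",
    "Sitemap: http://www.example.com/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(robots_lines)

# site_maps() returns the list of Sitemap entries, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The Sitemap line is independent of the User-agent groups, so it can appear anywhere in the file.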
Sitecore .xml files security
As you probably know, Sitecore blocks access to XML files for security
reasons, just as it blocks access to license.xml, so you need extra
configuration to serve the sitemap. Add the following handler to allow
access to any XML file whose name matches sitemap_*.xml:
<add verb="GET" path="sitemap_*.xml" type="System.Web.StaticFileHandler" name="allow xml sitemap" />
Sitemap.xml structure and tags clarification
Now let's look at the structure of the sitemap.xml file, including its
required and optional tags. Here is a sample of a single entry in a
sitemap.xml file:
<url>
  <loc>http://www.MySite.com/AboutUs</loc>
  <lastmod>2015-08-26T07:53:49+03:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
</url>
As you can see, the sample uses the following tags:
- loc: the absolute URL of the page (required tag).
- lastmod: the date the page was last modified (optional tag).
- changefreq: a hint to the search engine crawler about how frequently the page is likely to change, which can influence how often the crawler revisits it (optional tag).
- priority: how important this page is relative to the other pages on your website, from 0.0 to 1.0 (optional tag).
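The tag rules above can be sketched as a small Python helper that builds one entry and rejects invalid optional values. This is only an illustration of the sitemap protocol, not a Sitecore API; the function name build_url_entry is hypothetical.

```python
from xml.etree import ElementTree as ET

# Values allowed for <changefreq> by the sitemaps.org protocol.
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; only <loc> is required."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        if changefreq not in CHANGEFREQ_VALUES:
            raise ValueError(f"unsupported changefreq: {changefreq}")
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        if not 0.0 <= priority <= 1.0:
            raise ValueError("priority must be between 0.0 and 1.0")
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


# Rebuild the sample entry shown above.
entry = build_url_entry(
    "http://www.MySite.com/AboutUs",
    lastmod="2015-08-26T07:53:49+03:00",
    changefreq="daily",
    priority=0.5,
)
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

Leaving an optional argument as None simply omits that tag, which keeps the entry valid with loc alone.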
What a good sitecore sitemap module should have
The Sitecore Marketplace has many sitemap modules that help you configure
this feature in a few simple steps, but I didn't find a module that covers
all of the following:
- Support multi-site.
- Support multilingual.
- Support all the optional tags mentioned above.
- Support HTML sitemap component.
- Support allowing or disallowing specific items in sitemap.xml or the HTML component.
- Support submitting the sitemap to common search engines.
In part 2 of this blog I will provide detailed steps for implementing the above features for full sitemap functionality.