Is your site's Sitemap.xml well done?

Korea Data Forum Fosters Collaboration and Growth


Post by shakilhasan15 »

A website's sitemap.xml is a very important ingredient for SEO: it tells Google and other search engines which URLs on your website you want indexed and their relative importance.
Monitoring, updating, and verifying the syntactic correctness of the sitemap.xml should therefore be done periodically, depending on how often your site is updated, but in any case at least every two months.
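
The syntax check can also be scripted. Here is a minimal sketch in Python, assuming the sitemap is a plain URL set (not a sitemap index) that uses the standard sitemaps.org namespace. The function name `sitemap_urls` and the example.com URLs are only illustrative.

```python
import xml.etree.ElementTree as ET

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values of a sitemap.

    Raises ET.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Tiny illustrative sitemap (made-up URLs).
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(sitemap_urls(sample))
```

If the file is malformed (for example an unclosed tag), the parser raises an error instead of returning a list, which is already a useful first check before looking at the URLs themselves.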

Spider and Sitemap.xml

For the format and correct syntax of the sitemap.xml I refer you to this guide; here we see how to check the completeness of your Sitemap.xml with Excel and Screaming Frog (or another crawler, since the free version of Screaming Frog crawls at most 500 URLs), in three simple steps:

1. Prepare data for analysis
2. Verify that all URLs in Sitemap.xml are reachable by the spider
3. Verify that all navigable pages are included in sitemap.xml

1 - Prepare the data for analysis
Let's start by scanning only the HTML pages and images of the website, excluding CSS and JavaScript. In Screaming Frog, enter the site you are working on, then go to "filter" and first select "HTML".

Screaming Frog Spider Mode
Source: ScreamingFrog

Once the process is finished, we extract the data into Excel and keep only the list of URLs and the HTTP Status Code (i.e. the server response).
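
If you export the crawl as CSV, the same reduction to two columns can be sketched in Python. The column names `Address` and `Status Code` are assumptions about the export header, so adjust them to whatever your file actually contains; `keep_url_and_status` is a hypothetical helper and the rows are made-up data.

```python
import csv
import io


def keep_url_and_status(csv_text, url_col="Address", status_col="Status Code"):
    """Return (url, status_code) pairs from a crawl export given as CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[url_col], int(row[status_col])) for row in reader]


# Made-up export with one extra column that gets dropped.
export = (
    "Address,Content Type,Status Code\n"
    "https://example.com/,text/html,200\n"
    "https://example.com/old,text/html,404\n"
)

print(keep_url_and_status(export))
```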

The table below shows the main HTTP Status Codes and their meaning:
HTTP Status Code Meaning
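
As a quick reference, the standard meaning of a status code can also be looked up from Python's standard library. This is only a sketch; `describe` is an illustrative name, and the four codes shown are just common examples.

```python
from http import HTTPStatus


def describe(code):
    """Return the standard reason phrase for a status code, or 'Unknown' if it is not a standard one."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return "Unknown"


for code in (200, 301, 404, 500):
    print(code, describe(code))
```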

Now we repeat the same process, this time choosing the “images” option instead of “HTML” in the “filter” drop-down menu.
I called the Excel sheet that we created with HTML pages and images “SCAN”.

Now we navigate to the site in question and save the Sitemap.xml file locally. If the site is managed correctly on the SEO side, we should find it at its expected location; from there we right-click and select "save as".

Sitemap.xml of a site

At this point we open Screaming Frog, select “Mode” - “list” in the top menu, and load the sitemap.xml that we saved earlier.

ScreamingFrog, mode list
Source: ScreamingFrog


Once the scanning process is finished, we export the “URL” and “Status Code” columns to another sheet of the same Excel file.
I called the Excel sheet “SITEMAP”.

At this point we should have an Excel file with the sheets “SCAN” (with the site crawling results) and “SITEMAP” (with the Sitemap.xml urls).

Let's check that everything is in the right place!

2 - Verify that the Sitemap.xml contains all the correct URLs
Let's go to the Excel sheet called "SITEMAP" and, in the cell to the right of the status code of the first URL (it should be cell C1 if you have not added a header row, otherwise C2), insert the VLOOKUP function, setting the lookup on the sheet "SCAN".
The formula I use in this case is =VLOOKUP(A1,SCAN!A:A,1,0), if there are no titles or headers. In Excel locales that use the semicolon as argument separator, write =VLOOKUP(A1;SCAN!A:A;1;0) instead.
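
The same exact-match lookup can be sketched outside Excel. Assuming the two sheets have been reduced to plain lists of URLs (the lists below are made-up data and `missing_from` is a hypothetical helper), a set membership test plays the role of the lookup formula:

```python
def missing_from(lookup_urls, reference_urls):
    """Return the URLs in lookup_urls that do not appear in reference_urls, keeping their order."""
    reference = set(reference_urls)
    return [url for url in lookup_urls if url not in reference]


# Made-up contents of the "SCAN" and "SITEMAP" sheets.
scan = ["https://example.com/", "https://example.com/blog"]
sitemap = ["https://example.com/", "https://example.com/old-page"]

# Sitemap URLs the crawler never reached
# (the rows where the lookup formula would return an error).
print(missing_from(sitemap, scan))

# Crawled pages that are not listed in the sitemap (step 3 of the list above).
print(missing_from(scan, sitemap))
```

Swapping the two arguments gives the check in the other direction, so one helper covers both step 2 and step 3.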