URL Pruning in Google’s index – an SEO strategy

In this article we will go through a guided process of removing low-value webpages from your website and the Google index.

This exercise is part of an overall SEO strategy to improve your website and convert traffic into sales. It is an essential step in a content audit. During the audit, you may find a number of posts that are not bringing in significant traffic or backlinks, or generating any buzz on social media, but that may still be bringing in new readers or supporting business partnerships. Such pages may have to be dealt with separately.

After a thorough content audit of your website, define the tasks according to priority. There may be many issues, such as missing or incorrect canonical tags, faulty redirects, or a mix of http and https pages. Or there may be only a few, depending on the health of your website and its content. Either way, prioritize the clean-up tasks.

Now, let’s begin with the step-wise guide to removal of multiple pages from the website.

1.Using the ‘noindex’ tag

Use a ‘noindex’ meta tag in the HTML of your page. This will prevent Google from showing that page in the search results, even if other websites link to it.

Remember that this tag does not prevent Google from crawling the page. Googlebot will still crawl the page, read the meta tag or HTTP header, and drop the page from the results when it comes across the ‘noindex’ directive.

Also, for this step to be effective, the webpage must not be blocked by the robots.txt file. If it is blocked, Googlebot will not be able to crawl the page and hence will never encounter the ‘noindex’ tag, so the page may still show up in the search results.
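The two conditions above (the page carries ‘noindex’ and is not blocked by robots.txt) can be checked with a short script. The following is only a minimal sketch using the Python standard library. The function names, the sample HTML and the example.com URLs are assumptions made for illustration, and the robots.txt matching done by Python's parser may differ in edge cases from what Googlebot does.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        # "googlebot" is the Google-specific variant of the robots meta tag.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows the URL to be fetched."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def noindex_can_take_effect(html, robots_txt, url):
    """The tag only works if the crawler is allowed to reach the page."""
    return has_noindex(html) and is_crawlable(robots_txt, url)
```

A page that reports ‘noindex’ but fails the crawlability check is exactly the case described above: the tag is present, but the crawler is never allowed to read it.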

2.Make a Sitemap for the pages

To make sure that Google knows about all the webpages involved, you can create a sitemap for the pages that you want to remove. This will also give you a clear count of the pages you have to remove.

Before removing these pages, you can categorize them according to their content and submit the sitemap to Google Search Console. This step not only helps Google find the pages on your website, but also helps you understand how the pages you want to keep fit together.
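As an illustration, a sitemap of the pages to be removed can be generated with a few lines of Python. This is a sketch, not a complete sitemap generator: the function names, the category labels and the "sitemap-prune-" file prefix are made up for the example, and only the required <loc> element of the sitemap protocol is written.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    return ET.tostring(urlset, encoding="unicode")


def build_sitemaps_by_category(pages_by_category):
    """Build one sitemap per content category.

    pages_by_category maps a category label (hypothetical, chosen by you)
    to the list of URLs in that category. The result maps a file name to
    the XML that should be saved under that name.
    """
    return {
        f"sitemap-prune-{category}.xml": build_sitemap(urls)
        for category, urls in pages_by_category.items()
    }
```

Keeping one file per category makes it easier to follow each group separately once the sitemaps are submitted.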

3.Check the internal links and page values

Always check the internal links before removing the webpages to ensure that no broken links are left behind. This step is essential to the removal process, as it affects the performance of the whole website.

The structure of a website changes considerably when so many webpages are removed. This calls for a thorough check and update of the internal links.

You can use the available tools to look for patterns and to check the internal and external links of your webpages. Find the links on pages that generate a significant amount of traffic and point them to relevant pages that will remain, so that this traffic is not lost.
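If you prefer to script part of this check, the sketch below finds the links on a page that point to URLs scheduled for removal. It uses only the Python standard library. The class and function names are illustrative assumptions, and the URL comparison is deliberately simple: it ignores fragments but does not normalize trailing slashes or query parameters.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.append(absolute)


def links_to_removed_pages(page_url, html, urls_to_remove):
    """Return the links on the page that point to a URL being removed."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    targets = set(urls_to_remove)
    return [link for link in collector.links if link in targets]
```

Running this over the pages you intend to keep gives a list of links that need to be updated or removed before the target pages are deleted.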

4.Make the listing and index pages

Make a listing page with all the URLs you want to remove from the website. Limit the number of URLs per page to a few hundred or a few thousand so that you can verify whether they are being crawled by Googlebot.

The listing pages can be submitted to Google Search Console. Using the ‘noindex, follow’ meta tag on the listing pages ensures that they do not appear in the Google search results themselves, while the links on them can still be followed.

If there are many URLs for you to remove, the number of listing pages will increase, since the number of URLs on one page is limited.

The naming convention followed in making the listing and index pages will determine the ease of analysis. Use a standard naming convention that can be easily recognized.
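The listing pages can also be generated automatically. The sketch below splits a list of URLs into batches and writes one small HTML page per batch. The "prune-listing-001.html" naming scheme, the default of 1000 URLs per page and the function names are assumptions chosen for the example, and the pages are assumed to carry a ‘noindex, follow’ robots meta tag.

```python
from html import escape


def chunk(urls, size):
    """Split a list of URLs into consecutive batches of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_listing_pages(urls, per_page=1000, prefix="prune-listing"):
    """Return a dict mapping a file name to the HTML of one listing page.

    File names are zero-padded (prune-listing-001.html, ...) so that they
    sort naturally and are easy to recognize in the server logs.
    """
    pages = {}
    for number, batch in enumerate(chunk(urls, per_page), start=1):
        items = "\n".join(
            '<li><a href="' + escape(url, quote=True) + '">' + escape(url) + "</a></li>"
            for url in batch
        )
        pages[f"{prefix}-{number:03d}.html"] = (
            "<!DOCTYPE html>\n<html>\n<head>\n"
            '<meta name="robots" content="noindex, follow">\n'
            f"<title>Listing {number}</title>\n"
            "</head>\n<body>\n<ul>\n" + items + "\n</ul>\n</body>\n</html>\n"
        )
    return pages
```

A recognizable prefix also makes it simple to filter the log files for requests to the listing pages later on.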

5.Removing the pages

Make sure that the pages listed in the sitemap are visible. Refer to the Index Coverage Status report to check if all pages are visible and determine the right time to remove the pages.

Before the removal of your pages, make sure that all the above steps are followed and completed.

Serve the deleted pages with either a 404 (Not Found) or a 410 (Gone) status code.
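How the status code is set depends on your web server or CMS. Purely as an illustration, the sketch below shows the idea with Python's built-in http.server module: paths in a removal list are answered with 410, everything else with 200. The REMOVED_PATHS set, the function name and the handler are hypothetical, and a production site would normally configure this in its web server instead.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlsplit

# Hypothetical list of paths that have been removed from the site.
REMOVED_PATHS = {"/old/a", "/old/b"}


def status_for(path, removed_paths=REMOVED_PATHS):
    """Return 410 for a removed path (ignoring any query string), else 200."""
    clean_path = urlsplit(path).path
    return 410 if clean_path in removed_paths else 200


class PruningHandler(BaseHTTPRequestHandler):
    """Answers GET requests with 410 Gone for removed pages."""

    def do_GET(self):
        status = status_for(self.path)
        body = b"Gone" if status == 410 else b"OK"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, the handler can be passed to http.server.HTTPServer, for example HTTPServer(("127.0.0.1", 8000), PruningHandler).serve_forever().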

6.Check if the action is completed

After removing all the internal links to the removed pages and retaining the pages with good traffic/conversions, it is time to check the sitemaps.

Using the “Request indexing” feature in Google Search Console, you can request indexing and choose the ‘Crawl this URL and its direct links’ submit method. Ensure that all the links are crawled by Google.

7.The Server logs and Sitemap

This is the time to analyse if all the actions were completed.

Analyse the server log files and verify that Googlebot has crawled all the index and listing pages, and that the assigned status code is returned correctly.

If you have carried out all the previous steps carefully, there should be few errors in your log files.
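The log check can be partly automated. The sketch below assumes that the server writes the common "combined" access log format and identifies Googlebot by its user-agent string only. That is a simplification, because a user-agent can be spoofed. The function names and the wording of the problem messages are made up for the example.

```python
import re

# Matches the request, status code and user-agent of a combined-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def googlebot_statuses(log_lines):
    """Map each path requested by Googlebot to the set of status codes returned."""
    seen = {}
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            seen.setdefault(match.group("path"), set()).add(int(match.group("status")))
    return seen


def find_problems(log_lines, expected):
    """Compare the log against the expected status code of each path.

    expected maps a path to the status code it should return, for example
    {"/old/a": 410}. Paths with no problem are left out of the result.
    """
    seen = googlebot_statuses(log_lines)
    problems = {}
    for path, status in expected.items():
        if path not in seen:
            problems[path] = "not crawled by Googlebot"
        elif seen[path] != {status}:
            problems[path] = f"unexpected status codes: {sorted(seen[path])}"
    return problems
```

An empty result means that every expected path was requested by a client identifying itself as Googlebot and returned only the expected status code.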

Also, check the sitemap to determine the visibility trend and the decline of excluded pages.

8.Sitemap, Index and Listing files

This is the crucial last step. It brings the process to an end and will yield the desired results if all the previous steps have been followed accurately.

By closely monitoring your website and log files, make sure that the removed pages are no longer in the index. Once you are completely sure, delete the sitemap, index and listing files.

The time it takes for pages to drop out of the index will vary for each website. Wait until the process is complete, and only then delete your sitemap and listing files.


At the end of this process, only high-value webpages will be left on your website. You may see the traffic to your website either increase or decrease. If there is no change in the volume of traffic, it means that the pages you removed were not contributing anything, so no harm is done.

The next step would be to concentrate on content quality and to follow a meaningful strategy to improve your website and hence the overall business.


