Unpacking Google’s New Robots.txt Policy Update

Google has recently made updates regarding unsupported fields within the robots.txt file. To understand this update, you must first understand what a robots.txt file is. This simple text file tells web crawlers (or “bots”) how to interact with your site or with specific pages. It resides in the root directory of your website and contains rules about which pages should and should not be crawled. Keep reading to learn more about the critical functions of robots.txt files and the latest Google update.

Example

This blocks bots from accessing /private/ and guides them to the XML sitemap:

    User-agent: *
    Disallow: /private/
    Sitemap: https://www.example.com/sitemap.xml

Critical Functions of Robots.txt Files

As mentioned above, the primary function of the robots.txt file is to provide web crawlers (bots) with instructions on how to interact with a website or page. Some essential functions include: 

  • Control Web Crawling: Website developers use the robots.txt file to specify which areas of a site search engine bots can crawl. The “disallow” directive tells search engines not to access certain pages, files, or sections of the website (see the sketch after this list).
  • Optimize Crawl Budget: By blocking unnecessary pages (category pages, inactive pages, etc.), the robots.txt file helps search engines focus on your more important pages, making crawling more efficient.
  • Direct Bots to Sitemaps: Another function of robots.txt files is to help bots discover and crawl all the essential pages of your site efficiently. This is done through the sitemap directive, which points crawlers to your XML sitemap, a file listing the pages you want search engines to discover and index.
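To see how a compliant crawler interprets these directives, here is a minimal sketch using Python’s standard urllib.robotparser module. The rules and URLs mirror the illustrative example.com file shown earlier and are assumptions, not a real site:

    from urllib.robotparser import RobotFileParser

    # Rules mirroring the example robots.txt above; example.com is illustrative.
    rules = [
        "User-agent: *",
        "Disallow: /private/",
        "Sitemap: https://www.example.com/sitemap.xml",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # A compliant crawler skips the disallowed section but may fetch other pages.
    print(parser.can_fetch("*", "https://www.example.com/private/report.html"))  # False
    print(parser.can_fetch("*", "https://www.example.com/services/"))            # True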

The Latest Google Robots.txt Update

Google has updated its guidance on robots.txt files. The company now makes explicit that its crawlers ignore any directives it does not support, and it supports only these four fields: user-agent, allow, disallow, and sitemap. The goal is clarity: unsupported fields that may have previously gone unnoticed will not influence how Google crawls and indexes your site.

Site owners should audit their robots.txt files to ensure they use only supported fields, as any unsupported fields will be disregarded. These unsupported fields can include custom directives or those used by third-party tools, such as ‘crawl-delay’ or ‘archive’. These changes reinforce Google’s efforts to streamline how robots.txt files are interpreted, ensuring the same approach for all websites.
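As a rough illustration of such an audit, the sketch below fetches a robots.txt file and flags any field name outside the four supported ones. This is a hypothetical helper written for this post, not an official Google tool, and the function name and URL are placeholders:

    import urllib.request

    # Fields Google documents as supported; anything else is ignored by its crawlers.
    SUPPORTED_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

    def audit_robots_txt(url):
        """Return the lines of a robots.txt file whose field Google would ignore."""
        with urllib.request.urlopen(url) as response:
            lines = response.read().decode("utf-8", errors="replace").splitlines()

        flagged = []
        for line in lines:
            rule = line.split("#", 1)[0].strip()  # drop comments and blank lines
            if not rule or ":" not in rule:
                continue
            field = rule.split(":", 1)[0].strip().lower()
            if field not in SUPPORTED_FIELDS:
                flagged.append(line)
        return flagged

    # Example usage (the URL is illustrative):
    # for line in audit_robots_txt("https://www.example.com/robots.txt"):
    #     print("Ignored by Google:", line)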

Get Found on Google with Boston Web Marketing

Google releases various updates throughout the year that are important to watch out for, and we are still learning more about this one. To ensure your website meets Google’s standards, work with our team at Boston Web Marketing.

If you need assistance ensuring your website is found on Google and using robots.txt files correctly, work with our experts at Boston Web Marketing. We can help you get found quickly on Google—call us at 857.526.0096.
