The ultimate guide to robots.txt

June 27, 2023 by BTR

The robots.txt file is one of the main ways of telling a search engine where it can and can’t go on your website. All major search engines support its basic functionality, but some respond to additional rules, which can be helpful too. This guide covers all the ways to use robots.txt on your website.

Warning!

Any mistakes you make in your robots.txt can seriously harm your site, so read and understand this article before diving in.

Search engine	Field	User-agent
Baidu	General	`baiduspider`
Baidu	Images	`baiduspider-image`
Baidu	Mobile	`baiduspider-mobile`
Baidu	News	`baiduspider-news`
Baidu	Video	`baiduspider-video`
Bing	General	`bingbot`
Bing	General	`msnbot`
Bing	Images & Video	`msnbot-media`
Bing	Ads	`adidxbot`
Google	General	`Googlebot`
Google	Images	`Googlebot-Image`
Google	Mobile	`Googlebot-Mobile`
Google	News	`Googlebot-News`
Google	Video	`Googlebot-Video`
Google	Ecommerce	`Storebot-Google`
Google	AdSense	`Mediapartners-Google`
Google	AdWords	`AdsBot-Google`
Yahoo!	General	`slurp`
Yandex	General	`yandex`
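
To show how the user-agent tokens in the table are used, here is a minimal sketch that checks a few URLs against a small robots.txt with Python's standard `urllib.robotparser` module. The rules, paths, and `www.example.com` URLs are made up for illustration, and `build_parser` is a hypothetical helper name. Python's parser is only one implementation, and its matching may differ in detail from how each search engine reads the file, so treat the results as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the paths are assumptions.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /photos/

User-agent: bingbot
Disallow: /drafts/

User-agent: *
Disallow: /admin/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot-Image has its own group, which blocks /photos/.
print(parser.can_fetch("Googlebot-Image", "https://www.example.com/photos/cat.jpg"))  # False

# bingbot's group only blocks /drafts/, so /photos/ stays crawlable for it.
print(parser.can_fetch("bingbot", "https://www.example.com/photos/cat.jpg"))  # True

# yandex has no group of its own, so it falls back to the * group.
print(parser.can_fetch("yandex", "https://www.example.com/admin/"))  # False
```

Each crawler follows the group that names its user-agent token and only falls back to the `*` group when no named group applies to it.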

Table of contents

What is a robots.txt file?

Crawl directives

What does the robots.txt file do?

Caching

Where should I put my robots.txt file?

Yoast SEO and robots.txt

Pros and cons of using robots.txt

Pro: managing crawl budget

A note on blocking query parameters

Con: not removing a page from search results

Noindex directives

Con: not spreading link value

Robots.txt syntax

WordPress robots.txt

The user-agent directive

The most common user agents for search engine spiders

The disallow directive

How to use wildcards/regular expressions

Non-standard robots.txt crawl directives

The allow directive

The crawl-delay directive

The sitemap directive for XML Sitemaps

Don’t block CSS and JS files in robots.txt

Test and fix in Google Search Console

Validate your robots.txt

Behind the scenes of a robots.txt parser
