Google Documents Its Three Types Of Web Crawlers


Google has updated its Verifying Googlebot and other Google crawlers help document to add a new section describing the three categories or types of crawlers they have. They have their Googlebot crawler, special-case crawlers and user-triggered crawlers.

I believe this was done after we, myself included, obsessed a bit over the new GoogleOther crawler. Gary Illyes from Google then said, "Please don't overthink it, it's really that boring." But I do what I do, so I overthought it. And Gary did what he does and had the help document updated to explain this in more detail.

The help document says, “Google’s crawlers fall into three categories.”

(1) Googlebot: The main crawler for Google's search products, it always respects robots.txt rules. Its reverse DNS mask is "crawl-***-***-***-***.googlebot.com or geo-crawl-***-***-***-***.geo.googlebot.com" and the list of IP ranges is in this googlebot.json file.

(2) Special-case crawlers: Crawlers that perform specific functions (such as AdsBot), which may or may not respect robots.txt rules. Their reverse DNS mask is "rate-limited-proxy-***-***-***-***.google.com" and the list of IP ranges is in this special-crawlers.json file.

(3) User-triggered fetchers: Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on the request of a user. Because the fetch was requested by a user, these fetchers ignore robots.txt rules. Their reverse DNS mask is "***-***-***-***.gae.googleusercontent.com" and the list of IP ranges is in this user-triggered-fetchers.json file.
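Those reverse DNS masks are what Google recommends checking when you want to confirm a visitor claiming to be Googlebot really is one: do a reverse DNS lookup on the IP, confirm the hostname matches a documented mask, then forward-resolve that hostname and make sure it points back to the same IP. Here is a minimal sketch of that two-way check in Python using only the standard library; the suffix list reflects the masks quoted above, and the helper names are my own, not anything from Google's docs.

```python
import socket

# Hostname suffixes from the documented reverse DNS masks above.
# (geo-crawl-*.geo.googlebot.com already ends in .googlebot.com.)
GOOGLE_CRAWLER_SUFFIXES = (
    ".googlebot.com",               # Googlebot
    ".google.com",                  # special-case crawlers
    ".gae.googleusercontent.com",   # user-triggered fetchers
)

def is_google_crawler_hostname(hostname):
    """Pure check: does a PTR hostname match one of the documented masks?"""
    return hostname.rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES)

def verify_google_crawler(ip):
    """Two-way DNS check for a crawler IP.

    1. Reverse-DNS the IP and check the hostname against the masks.
    2. Forward-resolve that hostname and confirm the original IP is
       among the results, which rules out a spoofed PTR record.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False
    if not is_google_crawler_hostname(hostname):
        return False
    try:
        _, _, resolved_ips = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    return ip in resolved_ips
```

Alternatively, you can skip live DNS entirely and match the client IP against the ranges published in the three JSON files Google links from the help document.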

Here is a screenshot of the new section in this help document:


Forum discussion at Twitter.





Source link : Seroundtable.com
