Google updated its Search Central documentation on verifying Googlebot, adding information about user-triggered bot visits. That information was missing from previous Googlebot documentation, an omission that caused confusion for many years and led some publishers to block the IP ranges of legitimate visits.
Newly Updated Bot Documentation
Google added new documentation that categorizes the three kinds of bots that publishers should expect.
These are the three categories of Google Bots:
- Googlebot – Search crawler
- Special-case crawlers
- User-triggered fetchers (GoogleUserContent)
The last one, GoogleUserContent, has confused publishers for a long time because Google didn’t have any documentation about it.
This is what Google says about GoogleUserContent:
“User-triggered fetchers
Tools and product functions where the end user triggers a fetch.
For example, Google Site Verifier acts on the request of a user.
Because the fetch was requested by a user, these fetchers ignore robots.txt rules.”
The documentation states that the reverse DNS mask will show the following domain:
“***-***-***-***.gae.googleusercontent.com”
In the past, some in the SEO community told me that bot activity from IP addresses associated with GoogleUserContent.com was triggered when a user viewed a website through a translate function that used to be in the search results, a feature that no longer exists in Google’s SERPs.
I don’t know if that’s true or not. It was enough to know that it was a visit from Google, triggered by users.
Google’s new documentation explains that bot activity from IP addresses associated with GoogleUserContent.com can be triggered by the Google Site Verifier tool.
But Google doesn’t say what else might trigger a bot from the GoogleUserContent.com IP addresses.
The other change in the documentation is that googleusercontent.com is now listed among the domain names that a verified Google IP address can resolve to.
This is the new text:
“Verify that the domain name is either googlebot.com, google.com, or googleusercontent.com.”
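The domain check quoted above can be sketched in Python. This is a minimal illustration and not Google's own tooling: the function names `hostname_is_google` and `verify_google_ip` are hypothetical, and the forward-lookup confirmation step is an assumption based on common reverse DNS verification practice rather than something quoted in this article. The sketch uses `socket.gethostbyname_ex`, which only handles IPv4.

```python
import socket

# Domains named in the verification text quoted above.
GOOGLE_DOMAINS = ("googlebot.com", "google.com", "googleusercontent.com")


def hostname_is_google(hostname):
    """Return True if hostname is one of the listed domains or a subdomain of one."""
    host = hostname.rstrip(".").lower()
    # Require an exact match or a leading dot so "fakegooglebot.com" fails.
    return any(host == d or host.endswith("." + d) for d in GOOGLE_DOMAINS)


def verify_google_ip(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the domain, then confirm with a forward lookup.

    The two lookup parameters are hypothetical hooks so the logic can be
    exercised without network access; by default the socket module is used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname_is_google(hostname):
            return False
        # Assumed extra step: the hostname should resolve back to the same IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False
```

Calling `verify_google_ip` with an address taken from a server log performs live DNS lookups, so results depend on the resolver available at the time.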
Another new addition is the following text, which was expanded from the old page:
“Alternatively, you can identify Googlebot by IP address by matching the crawler’s IP address to the lists of Google crawlers’ and fetchers’ IP ranges:
Googlebot
Special crawlers like AdsBot
User triggered fetches”
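The IP-matching alternative described in the quote can be sketched with Python's standard `ipaddress` module. This is an illustration under stated assumptions: the JSON layout (a `prefixes` array whose entries carry an `ipv4Prefix` or `ipv6Prefix` key) is assumed for the published range lists, the helper names are hypothetical, and the sample ranges are reserved documentation networks, not real Google ranges.

```python
import ipaddress
import json

# Placeholder data in the assumed layout; these are documentation-only
# networks (RFC 5737 / RFC 3849), NOT actual Google IP ranges.
SAMPLE_RANGES_JSON = """
{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}
"""


def load_networks(json_text):
    """Parse a range list into ipaddress network objects."""
    data = json.loads(json_text)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_ranges(ip, networks):
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Compare only against networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in networks)


networks = load_networks(SAMPLE_RANGES_JSON)
print(ip_in_ranges("192.0.2.55", networks))
print(ip_in_ranges("198.51.100.1", networks))
```

In practice the three lists named in the quote would each be loaded and checked separately, so a log entry can be labeled as Googlebot, a special crawler, or a user-triggered fetch.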
Google Bot Identification Documentation
The new documentation finally has something about bots that use IP addresses that are associated with GoogleUserContent.
Search marketers were confused by those IP addresses and assumed that the bots using them were spam.
A Google Search Console Help discussion from 2020 shows how confused people were about activity associated with GoogleUserContent.
Many in that discussion rightly concluded that it was not Googlebot but then mistakenly concluded that it was a fake bot pretending to be Google.
A user posted:
“The behaviour I see coming from these addresses is very close (if not identical) to legitimate Googlebot behaviour, and it hits multiple sites of ours.
…If it isn’t – then this seems to indicate there is widespread malicious bot activity by someone trying quite hard to look like Google on our sites which is concerning.”
After several responses, the person who started the discussion concluded that the GoogleUserContent activity was spam.
They wrote:
“…The Googlebots in question do mimic the official User-Agents, but as it stands the evidence seems to point to them being fake.
I’ll block them for now.”
Now we know that bot activity from IPs associated with GoogleUserContent does not come from spam or hacker bots.
They really are from Google. Publishers who are currently blocking IP addresses associated with GoogleUserContent should probably unblock them.
The current list of User Triggered Fetcher IP addresses is available here.
Read Google’s updated documentation:
Verifying Googlebot and other Google crawlers
Featured image by Shutterstock/Asier Romero
Source link : Searchenginejournal.com