Google has always said that pages serving a 404 status code are normal: most sites have them, and when someone tries to access a page that does not exist on your site, a 404 is the proper response. But Gary Illyes from Google is now sharing some cases when you should fix pages returning a 404 status code.
Gary posted this on LinkedIn and started off with the typical Google Search disclaimer, “404 (Not found) errors are not to be afraid of and you don’t need to scramble to fix them, at least not most of the time.”
Then Gary went into some of the times you might want to fix them…
Gary first explained what a 404 status code is. He said, “A HTTP 404 status code is for cases when a URL on your server is not mapped to a resource, so from your perspective it can be one of these two buckets: the URL SHOULD return content and a 200 status code, or the URL was indeed not supposed to return content. This second bucket could be split further, specifically URLs that could be useful to users and URLs that are absolutely useless.”
He then walked through these buckets and when a 404 (page not found) should be fixed, for instance by returning a 200 (all good) status code:
1. the URL SHOULD return content and a 200 status code. For example, you accidentally deleted the HTML mapped to the URL, or you messed up something with your database.
You should fix these as soon as possible, especially if the URL is important to your users and thus site.
2. the URL was indeed not supposed to return content, which can be either:
a) the URL COULD be useful to users. You should probably think about mapping these URLs somehow to a piece of content on your site by eg. redirecting. Some cases I’ve seen that fall into this category are broken links from high user-traffic pages; the users tap on the link, they find a 404 error even though you have the perfect content for them.
b) the URL is absolutely useless. From a user’s perspective, there’s nothing you should do about these. If you do, you just mislead them. Some cases I’ve seen that fall into this category is off-site links to content that you don’t have (say you changed business and you don’t sell surströmming anymore).
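To make the buckets above concrete, here is a minimal Python sketch of how a site owner might triage a list of 404 URLs. It is only an illustration: the `NotFoundUrl` fields, the `triage` function, and the example paths are hypothetical names made up for this sketch, not anything Gary or Google published, and deciding whether a URL "should exist" or has a good replacement still takes human judgment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Bucket(Enum):
    SHOULD_RETURN_CONTENT = "1"   # content went missing by accident
    COULD_BE_USEFUL = "2a"        # no content here, but a relevant page exists
    USELESS = "2b"                # nothing relevant to offer the user


@dataclass
class NotFoundUrl:
    path: str
    # Hypothetical flag: the URL is meant to serve content (e.g. a deleted page).
    expected_to_exist: bool
    # Hypothetical field: the closest live page on the site, if there is one.
    replacement: Optional[str] = None


def triage(url: NotFoundUrl) -> Tuple[Bucket, str]:
    """Return the bucket for a 404 URL and a suggested action."""
    if url.expected_to_exist:
        return Bucket.SHOULD_RETURN_CONTENT, "restore the content so the URL returns 200"
    if url.replacement is not None:
        return Bucket.COULD_BE_USEFUL, f"redirect to {url.replacement}"
    return Bucket.USELESS, "leave it as a 404"


if __name__ == "__main__":
    examples = [
        NotFoundUrl("/pricing", expected_to_exist=True),
        NotFoundUrl("/old-guide", expected_to_exist=False, replacement="/guide"),
        NotFoundUrl("/surstromming", expected_to_exist=False),
    ]
    for item in examples:
        bucket, action = triage(item)
        print(f"{item.path}: bucket {bucket.value}, {action}")
```

The order of the checks mirrors the order of the buckets: accidental 404s come first because those are the ones to fix as soon as possible, and a URL only stays a 404 when there is nothing useful to send the user to.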
Gary summed it up by adding, “Unconventional as it may be, you don’t need to fix all 404 errors: fix those that actually will help users.”
Some Other 404 Answers
In the thread, Gary answered more questions on how Google handles 404s:
Pierre Paqueton asked, “whenever Google sees a page is in 404, does it store the information somewhere so that it doesn’t use useful crawling (and my server) resources by crawling it? If I can make sure I remove all 404s from my website, it might be a bit harder to tell all websites linking to these pages which no longer exist to also remove their links.”
Gary Illyes answered, “I don’t remember the number and too lazy to look it up, but after a few tries we give up on 404 URLs and won’t retry them until we see new (as in newly created) links to them. Or at least that’s how I remember it works.”
Evgeniy Orlov asked, “404 generate crawling issues – gbot comes to recrawl them. Why do you not recommend (in this post) to turn 404 to 410?”
Gary Illyes answered, “because we treat them the same, so 404 create as much crawling issues as 410. people misuse status codes and we need to be flexible. PS: I don’t actually know what you mean by “crawling issues” in your comment.”
Jimmy Hartill asked, “Surely there’s an issue of trust for users with 2b)? If you follow a link to a site and its a 404, it doesn’t give an impression that you’re trustworthy – even if you don’t sell surströmming any more there ought to be something at the end of the link, even if it’s just a link to what you’re selling in place of it or a resource explaining you’ve changed business. Without throwing stones in a glass house, I’d imagine having any 404’s is detrimental to users just because they’re never *trying* to reach a 404?”
Gary Illyes answered, “if it answers their query then it’s 2a) though, right?”
Forum discussion at LinkedIn.
Source link: Seroundtable.com