We ran the largest hreflang study ever, nearly 10X larger than any other study. In total, we looked at issues on 374,756 different domains that used hreflang tags. Our findings show that 67% of them have at least one issue.
Let’s look at the most common issues you should actually care about.
Setting an x-default is not required. But it is recommended if you need a fallback page for users whose language settings don’t match any of your localized versions.
Hreflang works by the most specific match. Language+country is more specific than just language, which is more specific than x-default. X-default mostly serves as a backup or global default page where you want to send everyone who doesn’t match a more specific version.
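To make the matching order concrete, here is a minimal Python sketch. It is a simplified model of the order described above, not Google’s actual logic. The function name `pick_version`, the lowercase dictionary keys, and the example.com URLs are all assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_version(cluster: Dict[str, str], language: str,
                 country: Optional[str] = None) -> Optional[str]:
    """Return the URL of the most specific matching hreflang entry.

    `cluster` maps lowercase hreflang values to URLs (an assumed layout).
    """
    language = language.lower()
    candidates = []
    if country:
        # 1. Language+country is the most specific match.
        candidates.append(f"{language}-{country.lower()}")
    # 2. Language only.
    candidates.append(language)
    # 3. x-default is the fallback of last resort.
    candidates.append("x-default")
    for value in candidates:
        if value in cluster:
            return cluster[value]
    return None


# Hypothetical cluster for illustration.
cluster = {
    "en-gb": "https://example.com/uk/",
    "en": "https://example.com/en/",
    "x-default": "https://example.com/",
}

print(pick_version(cluster, "en", "GB"))  # https://example.com/uk/
print(pick_version(cluster, "en", "US"))  # https://example.com/en/
print(pick_version(cluster, "fr", "FR"))  # https://example.com/
```

An English speaker in the US falls through to the generic “en” page, and a French speaker lands on x-default because nothing more specific exists.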
Self-referencing hreflang tags are included in the guidelines. But they’re really more like a best practice and not actually required.
In the old days of hreflang, before the systems and plugins handled it, having a missing self-referencing tag meant that when you copied the tags to other pages, at least one of the connections would be broken. This is less likely to happen on modern websites, so it’s not as big of an issue.
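If you want to check for a self-reference yourself, a rough sketch might look like the following. It assumes you have already extracted a page’s hreflang annotations into a dictionary. The function name and URLs are hypothetical, and the comparison is an exact string match, so a trailing slash or protocol difference counts as a different URL.

```python
from typing import Dict


def has_self_reference(page_url: str, annotations: Dict[str, str]) -> bool:
    """True if any hreflang annotation on the page points at the page itself.

    `annotations` maps hreflang values to target URLs (an assumed layout).
    """
    return page_url in annotations.values()


# Hypothetical annotations found on the German page; it lists its
# alternates but not itself.
annotations_on_de_page = {
    "en": "https://example.com/en/",
    "fr": "https://example.com/fr/",
}

print(has_self_reference("https://example.com/de/", annotations_on_de_page))  # False
```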
If you link to an incorrect URL, then the tags are broken and pages can’t swap properly in the search results. Hreflang tags work in pairs to form a cluster of pages.
If the broken links are temporary while you’re still setting up pages, it’s OK to leave them. If these broken pages don’t exist and you don’t plan to have them, it doesn’t really hurt anything—but you may want to remove the references anyway.
Redirected pages included in hreflang tags are OK only if you have an auto-redirecting global version of the homepage.
There is an approved setup for homepages only that uses a 302 redirect for dynamic redirects based on location and language settings. I see people try to change this all the time, but it’s a documented setup that has been recommended and working on many sites for years.
In all other situations, a redirected page referenced in hreflang tags will mean that something is broken.
As I mentioned, hreflang tags work in pairs. If both pages don’t reference each other, they can’t establish the connection and swap properly in the search results.
This is especially important when you have multiple versions of a page in the same language. You may end up sending the user to a version of the page for the wrong country.
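A check for missing return tags can be sketched in a few lines of Python. This assumes you have crawl data shaped as “page URL to its hreflang annotations.” The function name, the data layout, and the URLs are assumptions for illustration, not how any particular tool does it.

```python
from typing import Dict, List, Tuple


def missing_return_tags(site: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where target does not link back to source.

    `site` maps each page URL to its {hreflang value: target URL} annotations.
    A target that was never crawled has no annotations, so it is reported too.
    """
    missing = set()
    for source, annotations in site.items():
        for target in annotations.values():
            if target == source:
                continue  # self-reference, nothing to reciprocate
            back_links = site.get(target, {})
            if source not in back_links.values():
                missing.add((source, target))
    return sorted(missing)


# Hypothetical two-page cluster: the US page points at the UK page,
# but the UK page only references itself.
site = {
    "https://example.com/us/": {
        "en-us": "https://example.com/us/",
        "en-gb": "https://example.com/uk/",
    },
    "https://example.com/uk/": {
        "en-gb": "https://example.com/uk/",
    },
}

print(missing_return_tags(site))
# [('https://example.com/us/', 'https://example.com/uk/')]
```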
Hreflang is one of many canonicalization signals that Google uses to determine which version of a duplicate page it should index. In many cases I’ve looked at, the canonical tag was ignored in favor of the URL specified in hreflang.
However, hreflang is just one signal among many and can be ignored, so it may work differently on your site.
Hreflang requires a two-letter language code (ISO 639-1), optionally followed by a two-letter country code (ISO 3166-1 alpha-2).
Common mistakes include using the country code instead of the language code, typos, region codes that aren’t supported, and three-letter codes instead of two-letter ones.
Some people just use codes that are wrong as well. For example, they use things like “la” for Latin America, but that doesn’t work. Another common one is “uk” when they should use “gb.” But the funny thing here is that “uk” is a specially reserved code, and Google actually accepts this one!
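A quick shape check catches several of these mistakes. The sketch below only verifies the format: two letters, optionally a hyphen and two more letters, or x-default. It does not check that the codes actually exist in the ISO lists, so a country code used as a language would still pass. It also deliberately ignores script subtags such as zh-Hant, which this article doesn’t cover. The names `HREFLANG_SHAPE` and `looks_valid` are made up for the example.

```python
import re

# Two-letter language, optionally followed by "-" and a two-letter country.
HREFLANG_SHAPE = re.compile(r"[a-z]{2}(-[a-z]{2})?")


def looks_valid(value: str) -> bool:
    """Format-only check; it does not confirm the codes exist in ISO lists."""
    value = value.strip().lower()
    return value == "x-default" or HREFLANG_SHAPE.fullmatch(value) is not None


for value in ["en", "en-GB", "x-default", "eng", "en_US", "es-419", "en-usa"]:
    print(value, looks_valid(value))
```

The first three values pass. The rest fail: a three-letter language code, an underscore instead of a hyphen, a numeric region code, and a three-letter country code.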
This issue shows pages with different language codes declared in the HTML language attribute and hreflang annotation for the URL.
These are different systems, but both are used to say what language the page is in. If they don’t match, something is fishy and you should check which language the page is actually in.
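As a rough illustration, you could compare the language part of the two values like this. The assumption here is that only the primary language subtag matters for this check. Whether any given audit tool also compares the country part isn’t something this sketch tries to reproduce. Function names are hypothetical.

```python
def primary_language(tag: str) -> str:
    """Return the language part of a tag such as 'en-US' -> 'en'."""
    return tag.strip().lower().split("-")[0]


def lang_mismatch(html_lang: str, hreflang: str) -> bool:
    """True if the HTML lang attribute and the hreflang value disagree on language."""
    if hreflang.strip().lower() == "x-default":
        return False  # x-default declares no language, so there is nothing to compare
    return primary_language(html_lang) != primary_language(hreflang)


print(lang_mismatch("en-US", "en-gb"))  # False, both are English
print(lang_mismatch("de", "en"))        # True, German vs. English
```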
For an hreflang language or language and country combination, you should only have one page specified for each unique value. If you specify “en” for a page and use “en” again but say it’s a different page, then Google is going to have to choose one or the other. They can’t both be the correct version.
While this sometimes happens in the code of the page, it’s often a mismatch between the code of the page and sitemaps. Ahrefs’ Site Audit looks at all the supported hreflang implementation locations, including the <head>, HTTP header, and sitemaps.
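Here is a minimal sketch of how you might spot this kind of conflict once you’ve collected annotations from every location. The tuple layout (source, hreflang value, URL), the function name, and the URLs are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def conflicting_values(
    annotations: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Return hreflang values that point at more than one URL.

    Each annotation is (source, hreflang value, URL), where source is
    where it was found, e.g. "head", "header", or "sitemap".
    """
    urls_by_value = defaultdict(set)
    for _source, value, url in annotations:
        urls_by_value[value.lower()].add(url)
    return {
        value: sorted(urls)
        for value, urls in urls_by_value.items()
        if len(urls) > 1
    }


# Hypothetical data: the page's <head> and the sitemap disagree on "en".
annotations = [
    ("head", "en", "https://example.com/en/"),
    ("sitemap", "en", "https://example.com/english/"),
    ("head", "de", "https://example.com/de/"),
]

print(conflicting_values(annotations))
# {'en': ['https://example.com/en/', 'https://example.com/english/']}
```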
In this case, pages were referenced for more than one language in hreflang annotations. For example, you may see this issue if you reference the page in an hreflang tag that specifies the page is for English and another hreflang tag that says it’s for Spanish.
You shouldn’t have two languages on the same page, so check which one is correct and remove the other one.
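The reverse check, one URL claimed for several languages, can be sketched the same way. Two assumptions are baked in: country variants of the same language (say, en-us and en-gb) count as one language, and x-default is skipped because it may share a URL with a language version. Names and URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def multi_language_urls(
    annotations: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Return URLs that are referenced for more than one language.

    Each annotation is (hreflang value, URL).
    """
    languages_by_url = defaultdict(set)
    for value, url in annotations:
        value = value.lower()
        if value == "x-default":
            continue  # assumed OK to share a URL with a language version
        languages_by_url[url].add(value.split("-")[0])
    return {
        url: sorted(languages)
        for url, languages in languages_by_url.items()
        if len(languages) > 1
    }


# Hypothetical data: the same page is declared as both English and Spanish.
annotations = [
    ("en", "https://example.com/page/"),
    ("es", "https://example.com/page/"),
    ("x-default", "https://example.com/page/"),
    ("en-us", "https://example.com/us/"),
    ("en-gb", "https://example.com/us/"),
]

print(multi_language_urls(annotations))
# {'https://example.com/page/': ['en', 'es']}
```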
Final thoughts
A huge thanks and shoutout to my colleague, Oleksiy Golvoko, for helping me gather this data! I’m surprised the numbers weren’t worse in the study, but I suspect that a lot of these sites have basic implementations.
Hreflang is complex and hard to get right. It can break in so many different ways. Here’s what Google’s John Mueller has to say about it.
Want to see if your site has hreflang issues? Run it through Site Audit or try it for free with Ahrefs Webmaster Tools.
Hreflang is a topic I’m passionate about and one that I’ve written about and presented on many times, so I was happy to write this up. One of the first blog posts I edited when I joined Ahrefs was our hreflang guide. I’d recommend reading it if you want to learn more about hreflang and some of its nuances.
If you have questions, message me on Twitter.
Source link: Ahrefs.com