Rand Fishkin, along with Mike King, may have published one of the biggest leaks about Google Search and its internal ranking features and signals, outside of what came out through the Department of Justice case. The document came from an anonymous source, was verified by Rand Fishkin, and contains a ton of details on how Google Search reportedly works.
More importantly, it seems to contradict a number of statements made over the past two decades by numerous Google Search employees, as I have covered here over the years.
I have not gone through it all yet, but I felt it was important for you all to read this yourselves. You can see the details at these headlines:
Rand wrote, “Many of their claims directly contradict public statements made by Googlers over the years, in particular the company’s repeated denial that click-centric user signals are employed, denial that subdomains are considered separately in rankings, denials of a sandbox for newer websites, denials that a domain’s age is collected or considered, and more.”
Mike King wrote, “I have reviewed the API reference docs and contextualized them with some other previous Google leaks and the DOJ antitrust testimony. I’m combining that with the extensive patent and whitepaper research done for my upcoming book, The Science of SEO. While there is no detail about Google’s scoring functions in the documentation I’ve reviewed, there is a wealth of information about data stored for content, links, and user interactions. There are also varying degrees of descriptions (ranging from disappointingly sparse to surprisingly revealing) of the features being manipulated and stored. You’d be tempted to broadly call these “ranking factors,” but that would be imprecise.”
Aleyda Solis posted a quick summary on X covering part of the leak:
- There are 14K ranking features and more in the docs
- Google has a feature they compute called “siteAuthority”
- Navboost has a specific module entirely focused on click signals representing users as voters and their clicks are stored as their votes
- Google stores which result has the longest click during the session
- Google has an attribute called hostAge that is used specifically “to sandbox fresh spam in serving time”
- One of the modules related to page quality scores features a site-level measure of views from Chrome
I have not had time to go through everything yet; I will do that over the next several days.
I have also not seen any Googler publicly comment on this yet. I know it is new, and I don’t know if we will see any Googler comment on it.
This reminds me a bit of the Yandex search ranking leak.
Here are some posts on social about this. Again, this has only been out for a few hours, and no one but Rand and Mike has had any real time to process it in detail.
A huge thanks to @iPullRank, whom I contacted on Friday after seeing the leak, and who helped analyze and decipher much of these early findings: https://t.co/JGYdGydKlC
— Rand Fishkin (follow @randderuiter on Threads) (@randfish) May 28, 2024
Ok, let’s get this party started!
A couple weeks ago I said I was publishing the most important thing I ever wrote. I was wrong.
Documentation related to the Google Search algorithm leaked and I spent the weekend tearing it apart.https://t.co/v71B16Ggov
✌🏾
— Mic King (@iPullRank) May 28, 2024
🚨 Google Search’s Internal Engineering Documentation Has Leaked and analyzed by @iPullRank 👀 Many of these had been denied to be used by Google👇
* There are 14K ranking features and more in the docs
* Google has a feature they compute called “siteAuthority”
* Navboost has… pic.twitter.com/dlpCIQdpDm— Aleyda Solis 🕊️ (@aleyda) May 28, 2024
Until it (possibly) gets taken down by Google’s lawyers, here’s a direct link to the leaked Google ranking API docs
“google_api_content_warehouse v0.4.0”
Save these pages! https://t.co/8RgmoF69z9 pic.twitter.com/9dXobbr2U1
— Cyrus SEO (@CyrusShepard) May 28, 2024
Extremely interesting blog post by @iPullRank.
Another one of the many he writes and we save for is usefulness ⬇️ https://t.co/VZH8EARV1G— Gianluca Fiorelli (@gfiorelli1) May 28, 2024
Apparently someone at Google Search “accidentally” leaked an engineering document that reveals a ton of secrets about how the search engine works, including that they have a “Golden Document” flag which puts more weight on a document that is “Human labeled” which could mean some… pic.twitter.com/zeG79f161B
— Joe Youngblood (@YoungbloodJoe) May 28, 2024
If you want to geek out on this with me, I’ll keep updating this Google Doc for the next ~30 minutes with anything interesting before getting back to normal life.https://t.co/1iQ40nknZ0
— Glen Allsopp 👾 (@ViperChill) May 28, 2024
I am looking forward to really digging in on this.
Forum discussion at X.
Source link: Seroundtable.com