Google’s 14,000 Search Ranking Features Leaked Through Anonymous Source


Rand Fishkin and Mike King may have published one of the biggest leaks about Google Search and its internal ranking features and signals, outside of what the Department of Justice case revealed. The document came from an anonymous source but was verified by Rand Fishkin, and it contains a ton of details on how Google Search reportedly works.

More importantly, it seems to contradict a number of statements that Google Search employees have made over the past two decades, as I have covered here over the years.

I have not gone through it all yet, but I felt it was important for you all to read this yourself. You can see the details at these headlines:

Rand wrote, “Many of their claims directly contradict public statements made by Googlers over the years, in particular the company’s repeated denial that click-centric user signals are employed, denial that subdomains are considered separately in rankings, denials of a sandbox for newer websites, denials that a domain’s age is collected or considered, and more.”

Mike King wrote, “I have reviewed the API reference docs and contextualized them with some other previous Google leaks and the DOJ antitrust testimony. I’m combining that with the extensive patent and whitepaper research done for my upcoming book, The Science of SEO. While there is no detail about Google’s scoring functions in the documentation I’ve reviewed, there is a wealth of information about data stored for content, links, and user interactions. There are also varying degrees of descriptions (ranging from disappointingly sparse to surprisingly revealing) of the features being manipulated and stored. You’d be tempted to broadly call these “ranking factors,” but that would be imprecise.”

Aleyda Solis posted a quick summary on X covering part of the leak:

  • There are 14K ranking features and more in the docs
  • Google has a feature they compute called “siteAuthority”
  • Navboost has a specific module entirely focused on click signals representing users as voters and their clicks are stored as their votes
  • Google stores which result has the longest click during the session
  • Google has an attribute called hostAge that is used specifically “to sandbox fresh spam in serving time”
  • One of the modules related to page quality scores features a site-level measure of views from Chrome

I have not had time to go through everything yet; I will do that over the next several days.

I have also not seen any Googler publicly comment on this yet. I know it is new, and I don’t know if we will see any Googler comment on it.

This reminds me a bit of the Yandex search ranking leak.

Here are some posts on social about this. Again, this has only been out for a few hours, and no one but Rand and Mike has had any real time to process it in detail.

I am looking forward to really digging in on this.

Forum discussion at X.





Source link : Seroundtable.com
