A comprehensive guide to the Yandex leak: what do content writers need to know?
Yandex is a Russian multinational tech company, and the fourth-largest search engine in the world. On Jan 26, 2023, it experienced what is probably one of the largest data leaks that a modern tech company has suffered in years, when its search ranking factors were released to the public.
The leak revealed more than 20K ranking factors that Yandex uses, or used at some point, in its search algorithm. Many people cite 1,922 factors, but those come from a single file shared by Martin MacDonald. The complete archive contained far more, and Rob Ousbey compiled the findings into a pretty cool explorer.
Why this is important for copy and content writers who care about SEO
Many software engineers have worked for both Google and Yandex. They go to the same conferences to share ideas, discoveries, and projections. Yandex has a presence in Palo Alto and Google used to have one in Moscow. Yandex was built as a Google analog. According to this thread by Alex Buraks, Google and Yandex search queries have a 70% overlap. Yandex uses Google’s open-source technologies that have been fundamental for innovations in the Search industry, such as TensorFlow, BERT, and MapReduce. Yandex has its own version of Google Analytics called Metrika that, as we will see, affects ranking positions.
While Yandex is not Google, it’s not some random amateur project. By looking into this codebase, we can learn a lot about how one of the most important search engines in the world is built, and use that to up our game on SEO strategy.
If you clicked on the links, you found pieces of code that will probably make lots of sense, IF you’re an SEO-friendly programmer. But if you want to understand without getting lost in the matrix, we’ve compiled what we consider the best interpretations from different authors who analyzed the code.
We went so far into the rabbit hole we even found some programmer jokes inside the code:
(Screenshot taken from Michael King’s Twitter thread)
Most relevant ranking factors
We want to give special thanks to Alex Buraks, Michael King, Sean Si, and Dan Taylor, who wrote some excruciatingly precise threads and articles. We wove together their findings to create this comprehensive list.
Ranking Factor #1 - URL construction matters
Avoid numbers, too many slashes, and complex structures in your URL. When the search query matches the words in the URL, or at least some of them, the page ranks better. The optimum is to include 3 words from the search query.
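To make these URL checks concrete, here is a minimal Python sketch. It is not taken from the leaked code: the function name url_report and the example URL and query are hypothetical, and it only counts the traits named above (digits, slashes, and words shared with the query).

```python
import re
from urllib.parse import urlparse


def url_report(url: str, query: str) -> dict:
    """Summarise the URL traits discussed above for one URL/query pair."""
    path = urlparse(url).path
    # Split path and query into lowercase words on anything that is not a letter or digit.
    url_words = set(re.findall(r"[a-z0-9]+", path.lower()))
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return {
        "has_digits": any(ch.isdigit() for ch in path),
        "slashes": path.count("/"),
        "matched_words": sorted(url_words & query_words),
    }


# Hypothetical example: a blog post URL checked against a search query.
report = url_report(
    "https://example.com/blog/boho-living-room-decor",
    "boho living room decor ideas",
)
print(report)
```

A writer could run this against a list of planned slugs and target queries to spot URLs that share fewer than 3 words with the query.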
Ranking Factor #2 - Host reliability
Low-quality hosts will make you rank lower.
Ranking Factor #3 - User behavior
No one knows the exact parameters, but a high CTR, a low bounce rate, and longer time spent on your site make you rank better. If your page is the last one the user visits during the search (meaning the user got the info they needed), you rank better. SEO expert Michael King says this is one of the heaviest factors for Yandex's ranking.
Ranking Factor #4 - Pages’ Date
If your page is new or has been recently updated, it ranks better. If your document is older than 10 years or has no date, it will rank poorly. Date metadata is important, so if it’s missing from your content, it can lower your rankings.
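As a rough illustration, the sketch below flags pages whose date is missing or older than 10 years. The function name date_flag and its labels are hypothetical, and the 10-year threshold is simply the figure quoted above, not a value read from the leaked code.

```python
from datetime import date
from typing import Optional


def date_flag(published: Optional[date], today: date, max_age_years: int = 10) -> str:
    """Return a simple label for a page's date metadata."""
    if published is None:
        # No date metadata at all: the article says this can hurt rankings.
        return "missing date"
    # Approximate age in years, using the average year length.
    age_years = (today - published).days / 365.25
    if age_years > max_age_years:
        return "too old"
    return "ok"


# Hypothetical pages checked against a fixed "today" so the result is repeatable.
today = date(2023, 2, 1)
print(date_flag(None, today))
print(date_flag(date(2010, 1, 1), today))
print(date_flag(date(2022, 6, 1), today))
```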
Ranking Factor #5 - Crawl depth
Keep your important pages close to the main page. Top pages should be 1 click away, and important pages should be fewer than 3 clicks away.
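Click depth can be estimated with a breadth-first walk over your internal links, starting from the homepage. The sketch below assumes you already have a map of which page links to which (the site dictionary here is a made-up example). It is one way to measure depth, not how Yandex computes it.

```python
from collections import deque
from typing import Dict, List


def click_depths(links: Dict[str, List[str]], home: str) -> Dict[str, int]:
    """Return the minimum number of clicks from the homepage to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                # First visit in a breadth-first walk is always the shortest path.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/services", "/blog"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-1/gallery"],
    "/blog/post-1/gallery": [],
    "/services": [],
    "/old-landing": [],
}

depths = click_depths(site, "/")
too_deep = [page for page, depth in depths.items() if depth >= 3]
unreachable = [page for page in site if page not in depths]
print(too_deep, unreachable)
```

Pages listed in too_deep sit 3 or more clicks from the homepage, and pages in unreachable are never reached from it at all.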
Ranking Factor #6 - Backlink relevance
Backlinks placed on a site's main page carry more weight than those placed on its internal pages.
Ranking Factor #7 - Number of search queries for your site/URL
When people search for your site or URL directly, your pages rank better. This is where brand awareness campaigns (and having your brand in your URL) gain relevance.
Ranking Factor #8 - Wikipedia pages
Yandex has a special formula for Wikipedia pages that makes them rank better.
Ranking Factor #9 - Bookmarked pages
The more your page has been bookmarked, the better it ranks.
Ranking Factor #10 - Embedded maps
Having a map embedded on your page to show your location (e.g., Google Maps) gives you a higher ranking score.
Ranking Factor #11 - Broken videos
If you have a broken embedded video, you rank lower.
Ranking Factor #12 - Verified social media
Links to verified social media accounts help you rank higher.
Ranking Factor #13 - Link anchors
Backlink anchor text should contain all the words from the keyword. For example, if you’re targeting “boho living room decor” as the keyword for a certain page, try to get the complete keyword used as the hyperlinked text in links pointing to that page. You’ll lose ranking positions if commercial anchors make up more than 50% of your anchor links. By commercial anchors, we mean hyperlinked text containing search terms that refer to a product or service (aka high-intent keywords).
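The two anchor checks above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name anchor_report, the sample anchors, and the small set of "commercial" terms are made up, and what Yandex actually treats as commercial is not known from this article. Only the 50% limit comes from the text above.

```python
import re
from typing import Dict, List, Set


def words(text: str) -> Set[str]:
    """Lowercase word set for a piece of anchor text or a keyword."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def anchor_report(anchors: List[str], keyword: str, commercial_terms: Set[str]) -> Dict[str, float]:
    """Count anchors containing every keyword word and the share of commercial anchors."""
    keyword_words = words(keyword)
    full_match = sum(1 for a in anchors if keyword_words <= words(a))
    commercial = sum(1 for a in anchors if words(a) & commercial_terms)
    share = commercial / len(anchors) if anchors else 0.0
    return {"full_match": full_match, "commercial_share": share}


# Hypothetical backlink anchors pointing at one page.
anchors = [
    "boho living room decor",
    "buy boho decor",
    "this article",
    "cheap living room decor",
    "boho living room decor ideas",
]
report = anchor_report(anchors, "boho living room decor", {"buy", "cheap", "price"})
over_limit = report["commercial_share"] > 0.5
print(report, over_limit)
```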
Ranking Factor #14 - Quality backlinks
Not a surprise here, but quality backlinks make your page rank better. What they mean by “quality” is yet to be defined, but it probably refers to links from highly ranked pages.
Ranking Factor #15 - Quality content
Pages with low-quality content rank lower. By quality content, we mean useful content that matches the title, doesn’t plagiarize, and is long enough to give the reader some good information. The leaked Yandex code had a specific factor that mentioned word count.
Ranking Factor #16 - Ads on page
Having too many ads and banners on your page, or abusing the use of PPC, will make you rank poorly.
Ranking Factor #17 - .com domains rank better
Wondering if you should buy the .com or something else? Well, according to Yandex, .com is better.
Ranking Factor #18 - Google Analytics
Ranking Factor #19 - Geolocation
Websites that match the location of the query will rank better.
Ranking Factor #20 - Orphan pages
Avoid making any important page unreachable from your homepage. If it’s an orphan page, it will be dead for the search engine brain.
Ranking Factor #21 - Diversify your traffic
Having other traffic sources like social media, direct search, and PPC is good for your site. Yandex uses this to find out if your website is real, and not just some spammy SEO project.
Ranking Factor #22 - Returning visitors
Having users that return to your website gives you a better position. Email marketing gets pretty useful here.
Ranking Factor #23 - Server errors
Server errors and excessive 4xx errors can impact ranking.
Ranking Factor #24 - Time of query
The time of day is taken into consideration as a ranking factor. This makes sense especially for places with a fixed schedule, for example, a restaurant. Yandex will recommend an open one rather than a closed one.
Ranking Factor #25 - Meta Keywords
Google stopped using meta keywords as a factor in 2009, but Yandex seems to keep using them as a factor.
One step closer to understanding search engines
In the end, it’s not about taking these recommendations and going crazy with your website’s SEO strategy. It’s more about understanding how a modern search engine is built nowadays. You can extract these factors and add what makes sense to your little (or huge) SEO workbook. Try them out on some pages and see how it goes. Trial and error: a writer’s bread and butter.