Ranking Predict: SEO data at the leading edge of search

How can you be 92% sure, within just a few minutes, that a URL will rank in the top 10 Google search results for a given topic? That is exactly what Ranking Predict aims to do. OVH developed this open-source SEO data tool to support the growth of both the Group and its customers.

Google search rankings are clearly a strategic matter. 91% of clicks on the world's most famous search engine are made on the first page of results. But how can you be sure of appearing in the top 10? Which SEO rules should you follow to get reliably good results? After all, the tech giant's algorithm is as mysterious as it is dynamic and ever-changing. According to Vincent Terrasi, Head Data Officer at OVH, "SEO has long been considered magic. In each case, there are at least 500 possible strategies. The challenge is to quickly figure out which strategy is the best one to adopt to guarantee that you'll appear on that famous first page of results." That challenge sparked the idea of building a model capable of predicting, with a reliability of 92%, whether a page will rank among the top 10 Google search results: Ranking Predict.


Ultra-reliable search ranking prediction

Vincent Terrasi joined the OVH family two years ago, bringing with him deep experience in SEO and Customer Relationship Management (CRM). At OVH he met Rémi Bacha, a Data Scientist who has been at OVH since 2012 and is passionate about SEO. The meeting was extremely fruitful. "We're trying to respond to OVH's hypergrowth. Counting all the languages and brands we cover, we have 60 sites to manage. We really needed a tool to automate finding the best strategy to ensure the strongest possible organic search potential." Their combined approach forms the basis of Ranking Predict, which was designed internally. Rather than relying on intuition or testing each of the 500 strategies manually, the idea is to leverage machine learning to get the most reliable prediction possible. "It is impossible to achieve 100% reliability because of external factors like news items, a competitor appearing on the scene or a penalty from Google. But with a model that is 92% effective on average, we are pretty close," continues Rémi Bacha.


Figuring out the best SEO strategy in minutes

Google, which handles more than 90% of web searches in France, ranks results with an algorithm that includes RankBrain, a component boosted by artificial intelligence. The ranking of a search result depends on many factors and variables, such as the field of activity. Different users have different priorities. For one user, page load time could be the most important factor. For another, it could be the terms used. Ranking Predict therefore works in several stages, each relying on different tools, to estimate all these variables and figure out the best possible strategy. First, Ranxplorer is used to retrieve the top 100 keywords related to a particular topic. Each keyword is associated with the URL of the page concerned. Next, tools such as Visiblis or Majestic can be used to draw up a list of ranking factors. The machine learning element then comes into play, building a "random forest" of decision trees and adjusting the relevant variables to define the most appropriate final strategy.
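
To make the last stage more concrete, here is a minimal sketch of the random forest idea in plain Python. It is not the Ranking Predict code. The factor names, the data values and every function name are invented for illustration, and the calls to Ranxplorer, Visiblis or Majestic are replaced by a hard-coded table. Each "tree" is reduced to a single split to keep the example short. A real project would more likely grow full decision trees with a library such as scikit-learn's RandomForestClassifier.

```python
import random
from collections import Counter

# Hypothetical ranking factors, one value per URL (names are assumptions).
FEATURES = ["load_time_s", "referring_domains", "word_count"]

# Invented toy data, deliberately easy to separate.
# Each row: (factor values, label), where label 1 = URL ranks in the top 10.
TRAINING = [
    ((0.6, 140, 1800), 1),
    ((0.9, 95, 1500), 1),
    ((1.1, 120, 2100), 1),
    ((0.7, 80, 1300), 1),
    ((1.2, 110, 1600), 1),
    ((2.4, 12, 400), 0),
    ((3.1, 30, 650), 0),
    ((2.0, 8, 300), 0),
    ((2.8, 25, 800), 0),
    ((3.5, 40, 500), 0),
]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, feature_ids):
    """Find the single split (feature, threshold) with the fewest errors on rows."""
    best = None
    for f in feature_ids:
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds: midpoints between neighbouring distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def fit_forest(rows, n_trees=25, n_features=2, seed=0):
    """Train n_trees one-split trees, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    while len(forest) < n_trees:
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        if len({y for _, y in sample}) < 2:
            continue  # a split cannot be learned from a single class
        feature_ids = rng.sample(range(len(FEATURES)), n_features)
        forest.append(fit_stump(sample, feature_ids))
    return forest


def top10_probability(forest, x):
    """Share of trees voting that a URL with factors x reaches the top 10."""
    votes = [l_lab if x[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return sum(votes) / len(votes)


forest = fit_forest(TRAINING)
strong_page = (0.5, 200, 2500)  # fast, well linked, long content
weak_page = (4.0, 5, 200)       # slow, few links, thin content
print("strong page:", top10_probability(forest, strong_page))
print("weak page:", top10_probability(forest, weak_page))
```

The share of trees voting "top 10" acts as the prediction score. Comparing that score across variants of a page (for example, the same page with a shorter load time) is one way such a model could point to the factor most worth working on.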


Open source to develop together

"This tool offers a potential gain of three to five days per SEO project. With Ranking Predict, it takes just a few minutes to figure out the right strategy. In an environment as competitive and fast-changing as SEO, it really is a significant advantage," explains Vincent Terrasi. In recognition of his work in developing this tool, Terrasi was voted Search Personality of the Year at the second SEMY Awards. Ranking Predict has been designed using open-source code in order to benefit from everyone's collective intelligence – and so that everyone can benefit from the tool. "It's an approach that may appear unusual in this field, but it's justified by our desire at OVH to constantly transform and help our customers grow along with us," he emphasizes. Finally, he says, "we are fortunate that OVH placed its trust in us. Since it was launched, the tool has been widely copied. We are currently working on a second version, which will take into account even more parameters and be accessible to even more people thanks to an API."