Anti-DDoS Game: Protection that Thwarts Attacks on the Online Gaming Universe

Today, performing a denial of service attack is amazingly simple. It is no longer necessary to possess advanced technical skills to render a service unstable or unavailable. Anti-DDoS protection, like that developed in 2013 by OVH, makes it possible to limit the scope of such malicious acts, which are increasing sharply (+80% recorded by OVH between 2013 and 2014). The gaming and e-sports industries are particularly affected by this phenomenon, and the protective measures implemented by service providers often show their limits against the intensity and frequency of these attacks, in particular those which exploit the connectionless (“non-connected”) mode of UDP, the networking protocol used by many game and voice servers. This is why OVH developed specific anti-DDoS protection adapted to the gaming universe. Clément Sciascia, OVH developer, provides more details.

Why do attacks on gaming servers constitute a specific problem?

Attacks targeting gaming servers are more frequent than those mounted against other types of servers. A disgruntled player, a teenager who thinks he or she is a hacker or plays the part of a blackmailer, an unscrupulous competitor… the pretexts for attacks are numerous and often trivial. If attacks are increasing, it is because they are becoming easier than ever to carry out. Tutorials and scripts are easily found on the Internet, and for a few dollars, an army of cloud servers can be rented in order to bombard a targeted server with requests. Some sites offer “DDoS” as a service. They claim to offer “load testing” services, but it is easy to see how such a service could be misused.
Game and voice servers are especially sensitive, and any lag caused by server overload or bandwidth saturation affects players. This results in slowed game play or, in the case of voice servers, conversations that are choppy or simply lost.
The “classic” anti-DDoS system, the VAC, works in three steps: 1. packet analysis, 2. “vacuuming” of incoming traffic in the case of an attack, 3. mitigation (the filtering of non-legitimate packets). The “vacuum” only becomes active a short time after an attack starts. We are speaking about a delay of only a few seconds, which is not noticeable when running ordinary applications, but for game servers it creates a disruption.
In addition, numerous game and voice communication services, like TeamSpeak or Mumble (services that players on the same team use to speak to one another during game play), use the UDP protocol for communication. The main appeal of this protocol, also used by streaming services, is the speed with which it transmits small data packets while consuming few resources. UDP, unlike TCP, operates without negotiation: there is no handshake between the machines exchanging information prior to data transmission. When players join a game, connection authorization is managed at the application layer (L7). As a consequence, it is difficult to distinguish packets sent by an authorized IP from those sent by a spoofed IP.
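To make the absence of a handshake concrete, here is a minimal Python sketch (an illustration, not OVH code). The helper name `udp_roundtrip` and the use of the loopback address are assumptions chosen for the example. It sends a single datagram with no prior negotiation; the receiver learns the sender's address only from the packet it receives, which is why a forged source address is hard to tell apart from a genuine one at this level.

```python
import socket
from typing import Tuple


def udp_roundtrip(payload: bytes) -> Tuple[bytes, Tuple[str, int]]:
    """Send one datagram to a local UDP socket and return what the receiver sees.

    Hypothetical helper for illustration only.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        # No connect() and no handshake: the very first packet already carries data.
        sender.sendto(payload, receiver.getsockname())
        # The only thing the receiver knows about the peer is the source
        # address written in the packet header.
        data, source = receiver.recvfrom(2048)
        return data, source
    finally:
        sender.close()
        receiver.close()
```

With TCP, by contrast, the three-way handshake forces the peer to prove it can receive traffic at its claimed address before any application data is exchanged.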
In other words, mitigation is complicated because illegitimate packets look the same as legitimate packets. For example, the attack called “Source Engine Query”, a problem for Counter-Strike servers, consists of overloading the server with query requests, whose normal purpose is to retrieve information about the state of the server. If these packets were filtered indiscriminately, real players would not be able to view the server or receive its information.
The VAC, which effectively manages a wide variety of attacks, has difficulty with this type. This is why we had to innovate. For example, intervention is also required at the point where traffic leaves the machines. In the case of the “Source Engine Query” attack, the countermeasure consists of storing the server’s response in a cache and answering from it during a flood, so that illegitimate requests cannot exhaust server resources.
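The caching countermeasure can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code running on OVH equipment: the class name `QueryResponseCache`, the `fetch_from_server` callback and the 5-second lifetime are all hypothetical. The request bytes are the classic A2S_INFO query as publicly documented for Source engine servers; newer challenge-based variants are ignored here.

```python
import time
from typing import Callable, Optional

# Classic A2S_INFO request payload (Source engine server query).
A2S_INFO_REQUEST = b"\xff\xff\xff\xffTSource Engine Query\x00"


class QueryResponseCache:
    """Answer repeated info queries from a cache instead of the game server.

    Hypothetical sketch: names and the TTL value are assumptions.
    """

    def __init__(
        self,
        fetch_from_server: Callable[[bytes], bytes],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_server
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[bytes] = None
        self._cached_at = 0.0

    def handle(self, packet: bytes) -> Optional[bytes]:
        """Return a response for an info query, or None for any other packet."""
        if packet != A2S_INFO_REQUEST:
            return None  # not an info query: left to other rules
        now = self._clock()
        expired = now - self._cached_at >= self._ttl
        if self._cached is None or expired:
            # Only this path reaches the real server, at most once per TTL.
            self._cached = self._fetch(packet)
            self._cached_at = now
        return self._cached
```

In this sketch, however many queries arrive, the game server itself answers at most one per cache lifetime, while legitimate players still receive server information that is at most a few seconds old.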

Exactly, what is the difference between anti-DDoS Game protection and classic anti-DDoS protection? What mechanisms have you put in place and how did you get here?

The first step of our work, which lasted more than six months in total, was to establish a list of games and voice communication services based on two criteria: commercial success and ability to “attract” DDoS attacks. VeryGames, one of our customers specializing in hosting services related to video games, explained to us that some very popular games receive very few attacks. An example is Farming Simulator, whose players are on average older than Minecraft players. Within our lab, we installed a selection of games on laptops and connected them to servers to analyze network packets. This allowed us to envision the different possible attack strategies for each game. Initially, it was easier for us to use reverse engineering than to contact the software developers of each game. For a passionate online gamer like myself, it was a bit frustrating to play games in the name of R&D: our interest was only in the connection phase between the player and the server, because that is where attacks should be detected and treated.
Next, we imagined building an infrastructure to complement the “classic” anti-DDoS (the VAC), one that would allow us to analyze not only incoming traffic but also outgoing traffic (which is not the case with the VAC). Mitigation is therefore bidirectional (incoming/outgoing). Another difference is that it is constantly active, which makes it possible to react to the first packets of an attack. The goal was to have the server remain “playable” throughout the duration of DDoS attacks and, even better, to ensure that players would not be aware of any malicious activity.

A Tilera box located next to the machine for packet analysis. A specific treatment is applied for each game hosted.

As the schematic shows, the Tilera is a box situated close to the server which inspects TCP/IP and UDP packets, initiates mitigation, and can cache responses to lighten the load on the machine under attack when it is difficult to separate illegitimate packets from legitimate ones. In the case of “classic” attacks, that is to say those the VAC knows how to mitigate, the Tilera provides protection only for the time it takes the VAC to activate and take over. In addition, the Tilera box sits close to the server (at the same level as the switches) so that it provides effective protection even when an attack comes from within the OVH network. Mitigation filters attacks originating from machines situated within OVH until they can be identified and suspended.
Tilera equipment was chosen for its computing power, given the load it faces. Several thousand packets per second are screened using particularly complex algorithms, all at very high speed. Unlike the Arbor solution, with which the Tilera is paired in the VAC, Tilera equipment is delivered without software: the traffic analysis logic has to be developed from scratch. The implementation of the mitigation code (the algorithms) is based on information collected during the reverse engineering phase. It was not possible to develop a universal mitigation code. For each large family of games (Counter-Strike, Minecraft…), we have developed a “profile”, a set of predefined rules that users can deploy in one click on the Tilera box (via the customer control panel) to filter, as accurately as possible, the illegitimate inbound and outbound traffic of a server.
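One way to picture such a profile is as a named list of per-direction rules with a default-deny policy. The Python sketch below is purely illustrative: the `Rule` and `Profile` types, the rule contents and the 1400-byte limit are invented for the example and say nothing about the actual OVH rules, which are not published.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Rule:
    """One hypothetical filtering rule for a single traffic direction."""
    direction: str                      # "in" or "out"
    description: str
    accepts: Callable[[bytes], bool]    # True if the payload is acceptable


@dataclass(frozen=True)
class Profile:
    """A hypothetical per-game-family set of predefined rules."""
    game_family: str
    rules: Tuple[Rule, ...]

    def allows(self, direction: str, payload: bytes) -> bool:
        # Default deny: a packet passes only if some rule for its
        # direction accepts it.
        return any(
            rule.accepts(payload)
            for rule in self.rules
            if rule.direction == direction
        )


# Invented example profile; real profiles encode game-specific protocol knowledge.
EXAMPLE_PROFILE = Profile(
    game_family="example-game",
    rules=(
        Rule("in", "non-empty payload of at most 1400 bytes",
             lambda p: 0 < len(p) <= 1400),
        Rule("out", "non-empty payload of at most 1400 bytes",
             lambda p: 0 < len(p) <= 1400),
    ),
)
```

Keeping the rules as data rather than hard-coded logic fits the idea of one profile per game family that can be swapped or updated when a game changes.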

Is the protection that you put in place unique? What results did you achieve?

Once the solution was deployed, we tested it. First, it was tested internally, then in the form of a beta test open to the public, during which fifty machines could be rented for a maximum duration of fifteen days. The goal was to increase the number of games tested to prove the effectiveness of the anti-DDoS protection, fix any weaknesses and correct algorithms to eradicate false positives. We learned that Counter-Strike: Global Offensive has a connection protocol that varies depending on the method of connection used by players (via a browser, joining a friend by direct connection…). What a real headache!
In June 2015, satisfying results were obtained, allowing us to seriously consider putting a Game range of servers on the market with this protection included.
However, our work did not stop there. We always keep an eye on attacks and study very carefully the ones which we have not listed yet. Some turn out to be false positives caused by an administrator’s incorrect server configuration. Others are real attacks, which allow us to keep optimizing the protection by integrating algorithms to counter them. We must remain modest: we are playing cat and mouse with those who launch attacks, and with those, luckily fewer in number, who attempt to get through our protection by continually developing new methods of attack. We will never find a universal solution to counter all attacks, but the important thing is that we stay far enough ahead to anticipate the attacks of tomorrow. For this reason, it would be counter-productive to reveal any more about our algorithms. Moreover, games are regularly updated, so we must update our “profiles” accordingly.
Are we the only ones to offer effective gaming anti-DDoS protection? Today, few providers offer such protection. Nevertheless, it is not impossible to copy: it costs money, equipment and man-hours. It requires a certain know-how and probably recognition as a credible player in the game server hosting market, in order to cooperate more with game developers in the future. OVH also has a strong differentiator: a 3 Tbps capacity surplus maintained on its backbone. Relative to current customer usage, the OVH network is able to withstand, vacuum and mitigate a very large quantity of attacks.
Even the most advanced protection can never guarantee server availability if attacks saturate the network links upstream of the mitigation equipment. And this is the reason why developers operating their own platforms will run into more difficulties in the future. The maximum intensity of attacks is increasing (now at several hundred Gbps and several tens of millions of packets per second), resulting in two consequences for operators that do not have the network capacity to absorb the volume of such traffic: backbone saturation and interruption of service for all of their customers, and/or non-negligible financial consequences (transit fees to be paid for the excess traffic).

Will research and development of anti-DDoS Game benefit other OVH services or other business sectors that are online? Is this going to drive improvements in the “classic” (VAC) anti-DDoS?

It makes no sense to observe the outgoing traffic of the 200,000 servers hosted by OVH. The VAC works very well for the majority of attacks. But one could imagine other applications for the protection we specifically deployed for Game servers. For example, it could improve the protection of VoIP servers, which also use the UDP protocol and are exposed to the same risks as many game servers. It could also protect SQL servers, some of which use the connectionless protocol (notably MSSQL). In the same way, if we imagine a bidirectional anti-DDoS option, some services like video or music streaming could be interested.
The long term plan is to combine anti-DDoS and routing within vRouter (an ongoing project) to simplify the architecture of the OVH network, ensuring better control and complete traceability. This progression requires a change in technology as it must be compatible with x86 architecture, which is not the case with Tilera.