1.3 Tbps mitigated by the VAC: a look back at the Memcached episode
On Thursday, March 1st, around 2:00 am (GMT+1), our VAC mitigated a DDoS attack that targeted one of our customers. It exceeded 1.3 Tbps, eclipsing the previous record of 1 Tbps held by MIRAI. A few hours earlier, GitHub had been the target of another attack that also reached 1.3 Tbps. Unfortunately, the GitHub website remained unreachable for a few minutes.
These impressive attacks had one thing in common: the worldwide use of misconfigured Memcached services to launch what is known as “amplification attacks”. Here’s a look back at an emotionally charged week full of investigations.
Memcached: a default configuration as the root cause
Memcached is an open source distributed cache of the key/value store type. It is generally used to improve performance, notably on websites, to cache the results of database queries or to speed up dynamic page rendering. The tool is widely used by webmasters, and some software publishers don’t hesitate to include Memcached in their solutions; the Zimbra webmail service is one example. So why would such a commonly used service suddenly start to derail?
As early as 2017, security researchers from the Chinese company 360 identified a faulty configuration in Memcached that allowed amplified denial of service attacks. Memcached listens on a public interface by default when it is installed. Consequently, in the absence of any firewall rules, Memcached is accessible to anyone on the internet through the server’s public IP. If the service only listened for TCP connections, the issue would be limited to data confidentiality, since there is no authentication mechanism. Unfortunately, the service also listens on UDP, and that’s where the situation gets complicated: it becomes possible to amplify traffic by a factor of up to 51,000, whereas NTP only reached a factor of 557.
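To make that exposure concrete, here is a minimal Python sketch that inspects a memcached command line and reports whether the UDP listener would be reachable beyond the loopback interface. It assumes the defaults described in this article (listening on all interfaces, UDP enabled on port 11211), handles only the space-separated `-l` and `-U` options with a single IP literal, and the function name `udp_exposed` is purely illustrative.

```python
import ipaddress

DEFAULT_LISTEN = "0.0.0.0"   # assumption: default described in the article (all interfaces)
DEFAULT_UDP_PORT = 11211     # assumption: UDP enabled by default, as in the article


def udp_exposed(args):
    """Return True if these memcached arguments leave UDP open beyond loopback."""
    listen, udp_port = DEFAULT_LISTEN, DEFAULT_UDP_PORT
    it = iter(args)
    for arg in it:
        if arg == "-l":            # interface to listen on
            listen = next(it)
        elif arg == "-U":          # UDP port, 0 disables the UDP listener
            udp_port = int(next(it))
    if udp_port == 0:
        return False
    return not ipaddress.ip_address(listen).is_loopback


print(udp_exposed([]))                    # default settings
print(udp_exposed(["-l", "127.0.0.1"]))   # bound to loopback only
print(udp_exposed(["-U", "0"]))           # UDP listener switched off
```

A result of True only means the process itself would accept outside traffic; a firewall in front of the server can still block it.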
Amplification attack: what is it?
Amplification aims at generating a response that is bigger than the request itself. Figure #2 shows an amplification factor of 2, since the response is twice as big as the request.
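The arithmetic behind an amplification factor can be sketched in a few lines of Python. The function names are illustrative, and the second calculation is a theoretical best case for the attacker: it assumes every request is amplified by the maximum factors quoted earlier (51,000 for Memcached, 557 for NTP), which real attacks rarely achieve.

```python
def amplification_factor(request_bytes, response_bytes):
    """Ratio of response size to request size."""
    if request_bytes <= 0:
        raise ValueError("request size must be positive")
    return response_bytes / request_bytes


def required_request_bps(target_bps, factor):
    """Request bandwidth needed to produce target_bps of reflected traffic."""
    return target_bps / factor


# The simple case from the text: a response twice as big as the request.
print(amplification_factor(100, 200))  # 2.0

# Hypothetical upper bound: request traffic needed to reach 1.3 Tbps.
for name, factor in [("Memcached", 51_000), ("NTP", 557)]:
    mbps = required_request_bps(1.3e12, factor) / 1e6
    print(f"{name}: about {mbps:.0f} Mbps of requests")
```

Under those assumptions, roughly 25 Mbps of spoofed requests would be enough with Memcached, against a little over 2.3 Gbps with NTP.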
In order to unleash the full power of this phenomenon and use it for denial of service attacks, the attackers need to couple the amplification and reflection mechanisms. Reflection is the act of spoofing an IP address so that the response will be sent to the spoofed address. In fact, a server will blindly respond to the source IP of a packet, even if it’s been spoofed.
Reflection is common with UDP because the protocol is connectionless. Unlike TCP, it has no three-way handshake, so a single packet is enough to get a response from the recipient. With TCP, three packets (SYN, SYN-ACK, ACK) must be exchanged before the client and the server agree that a connection is established. If the source IP were spoofed, the server’s response (SYN-ACK) would go to the victim, and whoever initiated the connection would be unable to acknowledge it (ACK). The connection would never be established, which makes IP spoofing ineffective in that case. To find out more, go to the Wikipedia page that provides a reasonably good explanation of how a TCP connection is established.
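The difference can be illustrated with a toy Python model. This is a simulation only, not real networking: the addresses come from documentation ranges, the `Packet` and `TcpServer` names are hypothetical, and the handshake is reduced to a single random number that the client must echo back.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    payload: str


SERVER = "198.51.100.1"   # hypothetical reflector
VICTIM = "203.0.113.7"    # hypothetical victim


def udp_server(packet):
    """Connectionless service: answers whatever source address the packet claims."""
    return Packet(SERVER, packet.src, "DATA" * 100)


class TcpServer:
    """Connection-oriented service: sends data only after a completed handshake."""

    def __init__(self):
        self.pending = {}          # source IP -> number the client must echo
        self.established = set()

    def handle(self, packet):
        kind, _, value = packet.payload.partition(" ")
        if kind == "SYN":
            seq = secrets.randbits(32)
            self.pending[packet.src] = seq
            return Packet(SERVER, packet.src, f"SYN-ACK {seq}")
        if kind == "ACK":
            expected = self.pending.get(packet.src)
            if expected is not None and value == str(expected):
                self.established.add(packet.src)
            return None
        if packet.src in self.established:
            return Packet(SERVER, packet.src, "DATA" * 100)
        return None                # no handshake, no data


# The attacker forges the victim's address as the source of the request.
spoofed_request = Packet(VICTIM, SERVER, "get key")
udp_reply = udp_server(spoofed_request)
print(udp_reply.dst, len(udp_reply.payload))   # the large reply goes to the victim

tcp = TcpServer()
syn_ack = tcp.handle(Packet(VICTIM, SERVER, "SYN"))
# The SYN-ACK is delivered to the victim, so the attacker can only guess the number.
tcp.handle(Packet(VICTIM, SERVER, "ACK guess"))
tcp_reply = tcp.handle(spoofed_request)
print(syn_ack.dst, tcp_reply)                  # victim gets a small SYN-ACK, no data
```

In the UDP case the victim receives a reply far larger than the request. In the TCP case it receives only a short SYN-ACK, because the attacker never learns the number needed to finish the handshake.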
OVH’s action plan
Following this unusual increase in amplification attacks using Memcached, we quickly implemented an action plan to ensure that we could withstand very large attacks. Amplification is a vector that can be identified very easily. That being said, we must make sure that our interconnecting links to other operators (peering) don’t get saturated, so that our quality of service doesn’t suffer. As shown in Figure 1, our action plan quickly proved its worth when one of our customers was targeted by an attack of more than 1.3 Tbps, which our VAC intercepted without any service interruption.
The plan also aimed to prevent customers running a misconfigured Memcached service from being unwittingly involved in these attacks. Here is what we needed to do:
• Communicate in a targeted manner with customers who could be potentially affected in order to offer them support by providing them with a guide to configure their service.
• Avoid waiting several hours/days for our customers to fix the issue, by detecting the OVH IPs used as reflectors and working directly on the network without degrading our quality of service.
After discussing the matter, we decided not to block UDP port 11211 (the port used by Memcached) on our network, since it can also be used as a “client” port to initiate communications. Blocking it could have a negative impact on quality of service, notably for online gaming, VoIP, streaming and other services using UDP. Instead, we favoured a different approach: we place under mitigation the IPs that receive a large number of Memcached queries per second and have therefore been identified as contributing to denial of service attacks. By filtering out those incoming malicious queries, we made sure that our customers’ services would not serve as reflectors.
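The idea of flagging IPs by query rate can be sketched as a sliding-window counter. This is an illustration only and not a description of how the VAC works: the threshold, the window length and the `ReflectorDetector` class are all assumptions made for the example.

```python
from collections import defaultdict, deque

MEMCACHED_PORT = 11211
QPS_THRESHOLD = 100      # hypothetical value, not an actual production threshold
WINDOW_SECONDS = 1.0     # hypothetical observation window


class ReflectorDetector:
    def __init__(self, threshold=QPS_THRESHOLD, window=WINDOW_SECONDS):
        self.threshold = threshold
        self.window = window
        self.seen = defaultdict(deque)   # destination IP -> recent query timestamps

    def observe(self, timestamp, dst_ip, dst_port, protocol="udp"):
        """Record one inbound packet; return True if dst_ip should be mitigated."""
        if protocol != "udp" or dst_port != MEMCACHED_PORT:
            return False
        times = self.seen[dst_ip]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        return len(times) > self.threshold


detector = ReflectorDetector()
# 150 queries in 150 ms towards a hypothetical customer IP.
flags = [detector.observe(i * 0.001, "198.51.100.1", MEMCACHED_PORT) for i in range(150)]
print(flags.index(True))   # position of the first packet that trips the threshold
```

Because only the flagged IPs are filtered, traffic that merely uses port 11211 as a client port elsewhere on the network is left alone, which is the trade-off described above.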
On February 26, 2018, the first attacks we observed were generated by “gets” queries, as shown in Figure 4. Just like Cloudflare, we noticed that this command was being used to read a key containing a password-protected zip file, which itself contained a gif file.
What was that zip file doing in those caches? We immediately thought that the targeted services had been prepared before being used for the attacks, and we are currently trying to identify the source. At the moment, we know that Memcached had already been used for amplification for several days. The graphs in Figure 5 illustrate the bandwidth of one of our customers’ servers, which had been used to send attacks against the Chinese IP “59.56.19.xxx” (we are deliberately hiding the last octet) since February 24, 2018.
We realized that this IP was regularly being attacked through amplification (LDAP, NTP, SNMP, etc.) in a very synchronized fashion by machines on our own network. This can be explained in two ways: either the IP hosts a service that attracts a lot of hate, or it was a test IP used to set up Memcached amplification attacks.
On closer inspection, attacks against this IP are usually very short, from around ten seconds to a few minutes at most, whereas other attacks normally last several minutes, the most significant having clocked in at 3 hours. Many elements thus lead us to think that this IP is a “dstat”, that is, a permanent IP used to measure the efficiency of a DDoS in real time.
We can therefore assume that the admins of the “booter” (a DDoS sales platform) wanted to compare their various amplification attacks with Memcached amplification. Among the tests that were conducted over the same time frame, we can see NTP (UDP/123), LDAP (UDP/389), SSDP (UDP/1900) and BitTorrent.
While continuing our research on forums specializing in DDoS, we eventually found a rather explicit conversation (see Figure 6) in which a user claimed that his DDoS sales platform was the first to offer Memcached amplification.
How the threat evolved
Within a week, things changed significantly: while the idea of Memcached amplification at first seemed to originate from a single “booter”, other operators quickly implemented the feature to market it to their customers. From the very beginning, our honeypots revealed numerous ways of exploiting the Memcached services available on the Internet, as shown in Figure 7.
For some services, the zip file contained in the “a” key has now been replaced with an “abcdefghij” string that repeats itself thousands of times over, without any write attempts being detected.
We have also seen keys containing messages demanding payment of 50 XMR, a popular cryptocurrency also known as Monero. Could this be a ransom attempt targeting databases that, by definition, only store temporary data? Or just a means of filling the cache in order to increase the amplification factor?
Although a small number of open Memcached services were quickly corrected, there are still quite a few that can be used to launch an attack.
Just like NTP, it seems that the Memcached threat will be around for a long time. Even though the protective measures implemented by web hosting providers have reduced the size of the attacks (most of them are now under 100 Gbps), correction of the configuration by users remains the key factor. However, the NTP case has shown us that even after 5 years some servers are still vulnerable in spite of patches being available.
So as you finish reading this article, why not check the status of your firewall and take a look at which of your services are being exposed on the web?
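One possible way to run that check on a server you administer is to send it a Memcached “stats” query over UDP and see whether anything comes back. The Python sketch below builds the 8-byte UDP frame header defined by the Memcached protocol (request ID, sequence number, total number of datagrams, reserved field) followed by the command. The function names and the two-second timeout are illustrative choices, and the absence of a reply does not prove that the service is safe.

```python
import socket
import struct


def build_udp_stats_probe(request_id=1):
    """Memcached UDP frame: request id, sequence 0, 1 datagram in total, reserved 0."""
    header = struct.pack("!HHHH", request_id, 0, 1, 0)
    return header + b"stats\r\n"


def answers_memcached_udp(host, port=11211, timeout=2.0):
    """Return True if `host` replies to a Memcached "stats" query over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_udp_stats_probe(), (host, port))
            data, _ = sock.recvfrom(65535)
        except OSError:   # timeout, port unreachable or other network error
            return False
    return len(data) > 8  # more than the 8-byte frame header came back


print(len(build_udp_stats_probe()))   # size of the request in bytes
```

Calling `answers_memcached_udp` with the public IP of your own server, from a machine outside its network, shows what an attacker would see. Only probe machines you are responsible for.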