Charting the real-world application of CTFs
Introduction
“If you know the enemy and know yourself, you need not fear the result of a hundred battles. If you know yourself but not the enemy, for every victory gained you will also suffer a defeat. If you know neither the enemy nor yourself, you will succumb in every battle.”
― Sun Tzu, The Art of War
In a recent blog, one of our team members gave a primer on the value of Capture the Flag (CTF) challenges, along with an overview of the types of CTFs, some recommended tools to become familiar with and general guidance and best practices for succeeding at them.
At Black Lotus Labs, we participate in CTFs to understand how threat actors could circumvent security controls, exploit software vulnerabilities and chain multiple attack techniques. CTFs also help us know ourselves as a team; we leverage not just the expertise we bring from our varied backgrounds in analysis, development, data science, reverse engineering and system engineering, but the new collective skills we gain along the way to ultimately protect organizations and critical systems against theft, fraud, abuse and disruption.
In other words, CTFs are not just games; they provide opportunities to learn and apply skills that have significant value in addressing actual security events. In fact, Black Lotus Labs has used some of the skills acquired in CTFs to track and disrupt real-world threats. Here are a few examples.
Scenario 1 – Web shells
Web shells, scripts that enable remote access to a web server, are frequently employed in both CTF challenges and actual security events, including multiple large-scale cyberattacks in the first half of 2021. In January and February, we learned web shells were deployed to facilitate data exfiltration from Accellion FTA appliances. In February, Microsoft discussed a rise in web shell attacks as an entry point and persistence mechanism. In March, it was revealed that advanced threat actors were leveraging web shells to maintain persistence after compromising on-premises Exchange Servers. Because of the scale of the problem and the perceived risk arising from unauthorized access to U.S. networks, the FBI obtained court approval to proactively remove web shells from infected Exchange servers. And in April and May, FireEye wrote extensively about web shells used to exploit vulnerabilities in Pulse Secure devices.
Workout at home gym
In August 2020, we participated in the DEF CON Red Team Village CTF. One challenge involved hacking the website of a fictitious gym. As you might have guessed from the context above, we emulated a threat actor by successfully exploiting the site using a web shell. To complete the set of challenges, we performed additional pivots to obtain a more persistent shell and extract data from the back-end database.
The website was using the Gym Management System, which was known to contain a remote code execution (RCE) vulnerability. A published exploit could be used to deploy a PHP-based web shell on the site. Ironically, one of our rookie team members simply downloaded and successfully used the exploit, while more experienced competitors who followed the customary scan, enumerate and exploit phases were unsuccessful. We later learned that the target was running Fail2Ban, which automatically blocked IPs that were observed to be scanning. (The challenge description at least hinted at this: “Research will serve you better than using Burp Suite.”)
Once we had a web shell on the web server, we identified the IP of the database server by using the lsof command and searching for the active default MySQL port. Then we searched PHP files on the web server to find the password of the root account on the database server. By executing a Python script from the web shell, we were able to use these credentials to open a socket between the database server and our local machine. At that point, we could execute arbitrary commands on the database server, such as accessing and querying the SQL database. This sequence mirrors real world attacks, where threat actors gain initial access on one host and then perform additional pivots to escalate privileges and move laterally to other hosts within the targeted organization.
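The first step above, spotting the database server in lsof output, can be sketched in Python. This is a minimal illustration rather than what we ran during the CTF: it assumes output in the style of `lsof -i -n -P`, the function name is hypothetical, and the sample addresses in the usage note are invented. The only fixed fact is that 3306 is the default MySQL port.

```python
import re

MYSQL_PORT = 3306  # default MySQL port

# Matches the remote side of a connection, e.g. "...->10.0.0.9:3306 (ESTABLISHED)"
_REMOTE = re.compile(r"->(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\b")


def find_remote_peers(lsof_output: str, port: int = MYSQL_PORT) -> list:
    """Return the sorted, de-duplicated remote IPv4 addresses contacted on `port`.

    Assumes numeric output (no host or port name resolution); listening
    sockets have no "->" and are ignored.
    """
    peers = set()
    for line in lsof_output.splitlines():
        match = _REMOTE.search(line)
        if match and int(match.group(2)) == port:
            peers.add(match.group(1))
    return sorted(peers)
```

Given a line such as `apache2 1234 www-data 12u IPv4 56789 0t0 TCP 10.0.0.5:48312->10.0.0.9:3306 (ESTABLISHED)`, the function would report 10.0.0.9 as a candidate database server.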
Web shells in real security events
Because web servers are often exposed to the internet, they are frequently the initial target of an attack; by honing our skills in the fictitious gym exercise, we strengthen our ability to hunt malicious actors. At Black Lotus Labs, we use network telemetry like NetFlow from the Lumen global IP backbone and DNS resolution data from DNS resolvers to conduct our threat research and analysis, as well as to track threat actor activity like web shell deployment. Although it is difficult to distinguish web shell traffic from legitimate web activity, there are at least a couple of ways to identify malicious traffic. The first approach is to look for unusual characteristics of the IP addresses connecting to the web server (e.g., originating from a country or ASN which is not expected or customary). In recent months we observed threat actors evading this detection technique by deploying infrastructure within the targeted country to increase their stealth. This was the case with ReverseRat, a new RAT we investigated with infrastructure based in Pakistan using infected domains hosted in India to target Indian energy and government organizations.
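The first approach above can be illustrated with a short Python sketch. It is a simplified illustration, not our production tooling: the function names are hypothetical, and the prefix-to-ASN table is assumed to come from whatever routing or enrichment data an analyst has on hand.

```python
import ipaddress


def lookup_asn(ip: str, prefix_to_asn: dict):
    """Return the ASN of the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, asn)
    for prefix, asn in prefix_to_asn.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def flag_unexpected_sources(client_ips, prefix_to_asn, expected_asns):
    """Return (ip, asn) pairs for clients whose ASN is unknown or not expected."""
    flagged = []
    for ip in client_ips:
        asn = lookup_asn(ip, prefix_to_asn)
        if asn not in expected_asns:
            flagged.append((ip, asn))
    return flagged
```

As the ReverseRat case shows, a check like this is only a first filter: an actor who stages infrastructure inside the expected country or ASN would pass it.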
A second way to identify malicious traffic is by investigating the volume of flow. For example, after the Accellion FTA hacks using web shells came to light, we were able to find signs of data exfiltration from multiple victims in our NetFlow data. (Note that the ability to identify data exfiltration in NetFlow data applies generally, not only if performed via web shells.)
Examples of data exfiltration from Accellion FTA devices – more complete list on GitHub
In this example, other organizations had published C2 IPs for the hacking of Accellion FTA devices. When we looked in our data at the largest volumes of NetFlow for those IPs, the data exfiltration signal stood out strongly. We were able to validate the victim IPs were associated with Accellion FTA by reviewing data from Shodan. Data from a variety of organizations using Accellion FTA devices was stolen, and the Cl0p ransomware gang later attempted to extort them to prevent leaking the data. The rows above, sampled from the full list of targeted IPs, show the timeline for stealing data. The earlier window largely falls within a period from Dec. 21-29. We also saw another window from Jan. 20-21, which is consistent with Mandiant’s reporting of two rounds of exploitation using different vulnerabilities. NetFlow strongly complements analysis based on endpoint detection and logging, giving a holistic view of IPs targeted by the threat actors.
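The volume-based lookup described above can be sketched as a simple aggregation. This is a minimal illustration under stated assumptions: flow records are represented as dictionaries with hypothetical `src`, `dst` and `bytes` fields, and the function name is invented; real NetFlow records carry many more fields and are typically sampled.

```python
from collections import defaultdict


def rank_exfil_candidates(flows, c2_ips, min_bytes=0):
    """Sum bytes sent from each source to known C2 IPs and rank by volume.

    flows:  iterable of dicts with "src", "dst" and "bytes" keys (assumed schema)
    c2_ips: set of published C2 addresses
    Returns a list of (src, dst, total_bytes), largest first.
    """
    totals = defaultdict(int)
    for flow in flows:
        if flow["dst"] in c2_ips:
            totals[(flow["src"], flow["dst"])] += flow["bytes"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(src, dst, total) for (src, dst), total in ranked if total >= min_bytes]
```

Sorting by total bytes puts the heaviest victim-to-C2 pairs at the top, which is where an exfiltration signal would be expected to stand out.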
Scenario 2 – C2 communications
Black Lotus Labs previously blogged about a variety of botnets, including Mirai, Necurs, Emotet and TrickBot. Integral to being able to enumerate botnet infrastructure is understanding how malware functions and its role in infecting victims. The next series of challenges from the 2020 DEF CON Red Team Village CTF aligns very closely with how we track botnets and advanced threat actors by analyzing malware samples and reverse-engineering communications between bots and C2 servers.
CovidScammers
One set of challenges from the Red Team Village CTF, CovidScammers, tasked competitors with reverse engineering a malware sample, then identifying and exploiting the C2 server. Many of the challenges align closely with steps performed in analyzing real malware, such as finding C2 IP addresses or malicious domains embedded in a malware sample and determining details of how the malware persists on an infected machine.
The tools and techniques used to solve these challenges, such as Base64 decoding, IDA, Python scripting and GDB, are the same ones we use to understand and track actual botnets. We had to decode files and circumvent anti-debugging features in the malware sample, both of which are often necessary when reversing real malware.
There are many ways in which such data can be encoded in a binary, and the process of extracting details can range from simple to complex. In the CTF challenge titled “This is nice, might stay a while…”, we were asked to find the SHA1 hash of the path of the persistence location. Some of the prior CovidScammers challenges could be solved using simple techniques such as running the “strings” utility on the binary and applying Base64 decoding. However, none of the strings matched the file location, so we had to look at the disassembled code and understand its functionality.
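The simple technique mentioned above, extracting printable strings and trying Base64 on each, can be approximated in a few lines of Python. This is a rough sketch and not the exact procedure we followed: the function names are hypothetical, the minimum string length of four mirrors the default of the “strings” utility, and only ASCII runs are considered.

```python
import base64
import binascii
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least `min_len` long, like `strings`."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]


def try_base64(candidate: str):
    """Return the decoded text if `candidate` is valid Base64 that decodes to UTF-8."""
    try:
        raw = base64.b64decode(candidate, validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, ValueError):
        # Not valid Base64, or the decoded bytes are not text
        return None


def decode_candidates(data: bytes) -> dict:
    """Map each extracted string to its Base64-decoded text, where decoding works."""
    results = {}
    for text in printable_strings(data):
        decoded = try_base64(text)
        if decoded is not None:
            results[text] = decoded
    return results
```

Short strings can decode as Base64 by coincidence, so results from a pass like this still need a human look.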
Figure 1: Code disassembled from malware, showing the routine for decoding stack strings
The loop subtracted 0x58 from each byte in the stack string to decode it. Since there were more encoded strings which used the same method, we created a simple IDA Python script to decode a string at a given location.
Figure 2: IDA Python script for decoding stack strings
We used the script on each stack string until we found the path to the script “/etc/init.d/covid,” which ran at boot time.
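The decoding step and the final hash can be sketched in plain Python, outside IDA. This is not the script shown in Figure 2: the function names are hypothetical, wrapping the subtraction modulo 256 is an assumption about how the loop handles byte underflow, and hashing the bare path with no trailing newline is an assumption about what the challenge expected.

```python
import hashlib

KEY = 0x58  # value the decoding loop subtracted from each byte


def decode_stack_string(encoded: bytes, key: int = KEY) -> str:
    """Subtract `key` from every byte (mod 256, assumed) and return ASCII text."""
    decoded = bytes((b - key) % 256 for b in encoded)
    return decoded.decode("ascii", errors="replace")


def encode_stack_string(plain: str, key: int = KEY) -> bytes:
    """Inverse of decode_stack_string; handy for building test inputs."""
    return bytes((b + key) % 256 for b in plain.encode("ascii"))


def sha1_of_path(path: str) -> str:
    """Hex SHA1 digest of the path bytes (no trailing newline, assumed)."""
    return hashlib.sha1(path.encode("ascii")).hexdigest()
```

A round trip such as `decode_stack_string(encode_stack_string("/etc/init.d/covid"))` returns the original path, which can then be passed to `sha1_of_path`.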
In another challenge from CovidScammers, “License and Registration Please,” we were first asked to give the SHA1 hash of a file path containing a universally unique identifier (UUID). We ran our decoding function on the string located at 0x804A49A and found this file path: ‘/run/lock/.serverauth.N7tItfiw4p.’ In the disassembled code that followed this stack string, we saw calls to file access functions and a function named ‘uuid_generate’; the hash of the file path was the solution to this challenge.
A subsequent challenge, “PROTOCOL1,” asked us to identify the decoded string the malware uses to register itself with the C2. By analyzing the code and testing some of the embedded functions, we determined the correct function to register a new bot and used GDB to generate the bot ID. The bot registered with the C2 by sending an encoded message containing the command “REG” and the bot ID.
Figure 3: Message buffer for registering the bot
Enumerating real C2s
In tracking real threats on the internet, we at Black Lotus Labs often need to reverse engineer malware to understand its behavior. Our goal is to write a script that can impersonate a bot when talking to the C2. If successful, the C2 will send malicious payloads and commands to our emulated bot. We thereby validate a suspected C2 without actually carrying out the malicious commands. The challenges outlined above are a good analog for these activities. For example, when TrickBot infects a machine, the malware generates a unique bot ID with a tag indicating the manner of infection. The C2 then uses commands to gather information about the infected machine, update the bot software, download additional modules, change configurations and install additional malware such as ransomware.
Figure 4: TrickBot gathering information about the infected computer
Figure 5: Bot sending information to the C2 using WinHttpSendRequest
We verify that suspected C2s respond to our emulated bot with the same communication protocol we discovered from reverse engineering the malware sample.
As the challenge and TrickBot example above show, finding the C2 address from a malware sample is valuable, but at Black Lotus Labs we also find that machine learning models running across our network data are able to find suspected C2s with great success. This extends the value of the reverse engineering practiced in the DEF CON CTF work well beyond the malware samples that happen to be available.
Conclusion
As new vulnerabilities and exploits are brought to light, the TTPs used by threat actors constantly evolve. We’ve seen actors increasingly using offensive security tools like Cobalt Strike or Metasploit, leveraging newer programming languages in malware such as Go or D and repurposing compromised domains to host malicious payloads and receive malicious beacons. By participating in CTFs, security practitioners can gain a better understanding of current attack techniques like these and hone their skills in defense of the organization. As for Black Lotus Labs, we will continue to leverage our global visibility from the Lumen IP backbone to identify and disrupt threats. And we will be competing again at the upcoming DEF CON 29 Red Team Village CTF.
This information is provided “as is” without any warranty or condition of any kind, either express or implied. Use of this information is at the end user’s own risk.