Investigating a Penguin's memory for malicious activity
25 June 2020
Memory analysis is crucial in a cyber incident. It helps answer a variety of key questions but, most of all, whether any malicious activity is active and ongoing.
Nowadays, most malware is file-less and therefore goes undetected by most anti-virus (AV) products, as there's simply no file on disk to match a signature against.
However, malicious code must run at some point in memory. Machine learning and behavioral analysis can therefore be used to assist with the identification of malicious activity on systems. File-less attacks will often leverage trusted system tools already in place to carry out their actions, which will often disguise them and make the processes look legitimate.
Linux systems require a different approach
While this works well on Windows systems, where tooling readily supports the approach, it's a different story for Linux systems. We're going to look at how a well-known memory analysis tool, Volatility, can be used to uncover threat actor activity on a Debian 7.11 (Wheezy) Linux web server.
On this server, the threat actor successfully exploited vulnerabilities to gain access to the system, escalate privileges and carry out other actions, while maintaining undetected persistence and covering their tracks. The web server was hosted on a virtual machine (VM), so it was fairly straightforward to extract the .vmdk, .vmem and .vmsn files for interrogation. If this had been a live system, a tool such as LiME, a loadable kernel module (LKM) designed to capture memory, could have been used instead.
One of the main issues analysts encounter with Linux memory images is creating a Volatility profile that matches the memory. Why is this important? Volatility needs to interpret the memory structures correctly, and the kernel version plays a big part in that. In our case, we were able to extract the kernel symbol file from the /boot/ directory, called System.map-3.2.0-4-686-pae. The numbers represent the kernel version, followed by the major revision, the minor revision and finally the bug fix.
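As a small illustration of the naming scheme described above, the sketch below splits a System.map file name into its parts. The field names follow this article's description of the numbers; the name `flavour` for the trailing `686-pae` part is my own label, and the regular expression assumes the Debian-style layout seen here.

```python
import re
from typing import NamedTuple


class KernelRelease(NamedTuple):
    version: int
    major_revision: int
    minor_revision: int
    bug_fix: int
    flavour: str  # assumed label for the trailing part, e.g. "686-pae"


# Matches names such as "System.map-3.2.0-4-686-pae".
_RELEASE_RE = re.compile(r"^System\.map-(\d+)\.(\d+)\.(\d+)-(\d+)-(.+)$")


def parse_system_map_name(filename: str) -> KernelRelease:
    """Split a System.map file name into its kernel release components."""
    match = _RELEASE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised System.map name: {filename!r}")
    version, major, minor, bug_fix, flavour = match.groups()
    return KernelRelease(int(version), int(major), int(minor), int(bug_fix), flavour)


release = parse_system_map_name("System.map-3.2.0-4-686-pae")
print(release)
```

Knowing these components up front makes it easier to check whether an existing profile matches the image before building a new one.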
Volatility only ships with a predefined list of profiles so, if a suitable profile is not available on GitHub, you'll likely need to create one. A Linux profile combines module.dwarf, which holds debug information describing the target kernel's structures, with the matching System.map. In our case, we generated module.dwarf and built a suitable profile with the System.map file extracted from disk. One of the first things I like to look into is linux_bash, which recovers bash history from memory. It's very useful, especially if bash history was deleted from disk. Memory dumps are a gold mine for bash history!
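For the profile step described above, a Volatility 2 Linux profile is a zip archive holding module.dwarf and the matching System.map. The sketch below bundles the two files with Python's standard library. The archive name `Debian711.zip` is a hypothetical choice, and the demo uses empty placeholder files rather than real kernel artefacts so that it runs anywhere.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def build_profile(dwarf_path: Path, system_map_path: Path, out_zip: Path) -> List[str]:
    """Bundle module.dwarf and System.map into one zip and return the member names."""
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for source in (dwarf_path, system_map_path):
            # Store each file under its bare name, without the source directory.
            archive.write(source, arcname=source.name)
    with zipfile.ZipFile(out_zip) as archive:
        return archive.namelist()


# Demo with placeholder files; a real profile needs the genuine artefacts.
with tempfile.TemporaryDirectory() as workdir:
    work = Path(workdir)
    dwarf = work / "module.dwarf"
    system_map = work / "System.map-3.2.0-4-686-pae"
    dwarf.write_text("placeholder")
    system_map.write_text("placeholder")
    members = build_profile(dwarf, system_map, work / "Debian711.zip")

print(members)
```

The resulting zip would then be placed where Volatility looks for Linux profiles, which in a standard Volatility 2 checkout is the Linux overlays directory.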
A series of commands associated with process ID (PID) 32049 were executed, which can be summarized as follows:
- Viewed passwd user accounts
- Deleted root user .bash_history
- Accessed /tmp/ directory and listed all files, including hidden ones
- Changed into /bun/ directory and listed all files, including hidden ones
- Installed module suterusu.ko
- Ran sock 11 command, to hide the bun directory
- Payload was created
- Permissions were changed to make payload executable
- Ran sock 1 command, to hide the payload process payl (PID 32198)
- Sock tool and module were removed
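Triage of recovered bash history like the list above can be partly automated. The sketch below flags commands that match a few suspicious patterns. Both the patterns and the sample commands are illustrative assumptions loosely modelled on the list; they are not the actual history recovered in this case.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical patterns; tune them for your own environment.
SUSPICIOUS_PATTERNS = {
    "history tampering": re.compile(r"\brm\b.*\.bash_history"),
    "kernel module load": re.compile(r"\b(insmod|modprobe)\b"),
    "made file executable": re.compile(r"\bchmod\b.*\+x"),
    "account enumeration": re.compile(r"/etc/(passwd|shadow)"),
}


def flag_commands(commands: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (label, command) pairs for every command matching a pattern."""
    findings = []
    for command in commands:
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(command):
                findings.append((label, command))
    return findings


# Illustrative commands only, not the real recovered history.
sample = [
    "cat /etc/passwd",
    "rm /root/.bash_history",
    "ls -la",
    "insmod suterusu.ko",
    "chmod +x payl",
]
findings = flag_commands(sample)
for label, command in findings:
    print(f"{label}: {command}")
```

A pattern list like this will never be complete, so it works best as a first pass before a manual read of the full history.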
In summary, we've identified a series of suspicious executions. We've witnessed the deletion of bash history, the installation of an unknown LKM, a payload being executed and hidden in memory and, finally, toolsets being removed. This is all setting off alarm bells!
Let's take a closer look at PID 32198, the payload running in memory, via linux_pslist. We can determine that payl is active and running and, by backtracking to its parent process ID (PPID 32049), verify that it was spawned from the bash session whose history we recovered. Backtracking one step further (PPID 32047) shows that this bash session was started by sshd (PID 32047), so someone had connected to the system over SSH.
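The backtracking described above amounts to walking the PPID chain. Here is a minimal sketch, assuming the process list has already been exported into a dictionary. The three PIDs and process names come from this investigation; the parent of sshd is not stated in the article, so it is left as `None`.

```python
from typing import Dict, List, Optional, Tuple

# PID -> (PPID, process name). sshd's parent is unknown here, hence None.
PROCESSES: Dict[int, Tuple[Optional[int], str]] = {
    32198: (32049, "payl"),
    32049: (32047, "bash"),
    32047: (None, "sshd"),
}


def ancestry(pid: int, table: Dict[int, Tuple[Optional[int], str]]) -> List[Tuple[int, str]]:
    """Walk from a process up through its parents until the chain ends."""
    chain = []
    seen = set()
    current: Optional[int] = pid
    while current is not None and current in table and current not in seen:
        seen.add(current)  # guards against loops in a corrupted table
        ppid, name = table[current]
        chain.append((current, name))
        current = ppid
    return chain


chain = ancestry(32198, PROCESSES)
print(" <- ".join(f"{name}({pid})" for pid, name in chain))
```

The `seen` set is there because process structures in a compromised image cannot be assumed to be consistent.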
At this point, the payload was dumped via linux_procdump, to allow for further analysis. The SHA256 hash values were also collected for the purpose of remediation.
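Hashing a dumped artefact can be done with Python's standard library. The sketch below reads the file in chunks so large dumps do not need to fit in memory. The function name and chunk size are my own choices, not part of any tool used in the investigation.

```python
import hashlib
import sys


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python hash_dump.py <dumped-file> [<dumped-file> ...]
    for dumped_file in sys.argv[1:]:
        print(f"{sha256_of_file(dumped_file)}  {dumped_file}")
```

Recording the digest right after dumping also gives a reference point for showing later that the sample was not altered during analysis.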
Active threat actor
Now that we know we have an active threat actor on the system, we can check the network connections via linux_netstat to confirm this. As we can see, there's an active connection from another internal IPv4 address, used to SSH onto the system and, furthermore, we can see the payload listening on port 6789/TCP.
It was also possible to locate historic traces of the same IPv4 address, in the routing cache via linux_route_cache. This could indicate it’s been around for some time and we now have some historical evidence of network connection activity.
Threat actors will often leverage malicious rootkits and install them onto compromised systems to enable persistence and command and control (C2) activities. As we saw before, we have a payload listening on port 6789/TCP, which is an unknown process, spawned by a module known as suterusu. In this instance, we ran a check against all common modules installed on the system via linux_check_modules, however nothing notable jumped out.
However, malicious modules will often hide themselves in memory, and for that we can use linux_hidden_modules, which searches memory more deeply to try and identify other module structures. In this instance, we located suterusu.ko, which was dumped to disk via linux_moddump to allow for further analysis.
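Conceptually, spotting a hidden module comes down to comparing two views of memory: the modules the kernel lists, and the module structures found by searching memory directly. The sketch below shows that comparison as a set difference. Only `suterusu` comes from this case; the other module names are filler for illustration, and the plugin's real logic is more involved than this.

```python
from typing import Iterable, List


def find_hidden_modules(listed: Iterable[str], found_in_memory: Iterable[str]) -> List[str]:
    """Return modules found in memory that are absent from the kernel's module list."""
    return sorted(set(found_in_memory) - set(listed))


# Illustrative data: "suterusu" is from the investigation, the rest is filler.
listed_modules = ["ext4", "e1000", "loop"]
modules_found_in_memory = ["ext4", "e1000", "loop", "suterusu"]

hidden = find_hidden_modules(listed_modules, modules_found_in_memory)
print(hidden)
```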
Once we’ve identified the rootkit, we can investigate what actions it carried out
Now that we've identified the deployed rootkit as suterusu, we can carry out further checks to determine what other actions it carried out. Briefly, we identified suspicious hooking via the linux_check_syscall command. Hooking allows the rootkit to hijack legitimate functions and run its own malicious code in their place. Further digging on this entry can be undertaken to understand the inner workings of the rootkit.
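One simple way to reason about system call hooking is that each entry in the system call table should point into the kernel's own code. An entry pointing elsewhere, for example into module memory, deserves attention. The sketch below applies that range check. All addresses are hypothetical, and this is a simplification for illustration rather than a description of how the Volatility plugin works.

```python
from typing import List, Sequence, Tuple


def find_hooked_syscalls(
    table: Sequence[int], text_start: int, text_end: int
) -> List[Tuple[int, int]]:
    """Return (index, address) for entries that fall outside the kernel text range."""
    return [
        (index, address)
        for index, address in enumerate(table)
        if not (text_start <= address < text_end)
    ]


# Hypothetical 32-bit layout, for illustration only.
KERNEL_TEXT_START = 0xC1000000
KERNEL_TEXT_END = 0xC1400000

syscall_table = [
    0xC1052A10,  # points inside kernel text
    0xC10C3F80,  # points inside kernel text
    0xF8A3B120,  # points outside kernel text: candidate hook
    0xC1101234,  # points inside kernel text
]

hooked = find_hooked_syscalls(syscall_table, KERNEL_TEXT_START, KERNEL_TEXT_END)
for index, address in hooked:
    print(f"syscall {index} -> {address:#x} (outside kernel text)")
```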
Much like a Windows memory image, where it's possible to retrieve the active master file table ($MFT), in Linux we can review the files loaded in memory for anything notable. In this instance, we knew which directories were of interest and were able to dump a list of precisely 27,766 entries via linux_find_file for review.
Of major significance was a file path relating to dirty.c and since we now had its offset, we could dump it to disk for further inspection.
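Locating and dumping a file with linux_find_file is a two-step process: look the path up to get its inode address, then dump that inode to disk. The sketch below only builds the command lines as argument lists. The image name, profile name, file path and inode address are all placeholders, since the article does not give them.

```python
from typing import List


def volatility_command(image: str, profile: str, plugin: str, *plugin_args: str) -> List[str]:
    """Return an argv list for a Volatility 2 plugin run, usable with subprocess.run."""
    return ["python", "vol.py", "-f", image, f"--profile={profile}", plugin, *plugin_args]


IMAGE = "webserver.vmem"        # hypothetical image name
PROFILE = "LinuxDebian711x86"   # hypothetical profile name

# Step 1: look up the file by path to obtain its inode address.
locate_cmd = volatility_command(IMAGE, PROFILE, "linux_find_file", "-F", "/path/to/dirty.c")

# Step 2: dump the file using the inode address reported by step 1.
dump_cmd = volatility_command(
    IMAGE, PROFILE, "linux_find_file", "-i", "<inode-address>", "-O", "dirty.c"
)

print(" ".join(locate_cmd))
print(" ".join(dump_cmd))
```

Building the commands as lists rather than shell strings avoids quoting problems when paths contain spaces.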
Dirty COW Exploit Snippet
The file contains C code which is likely to have been compiled to allow for privilege escalation via the Dirty COW exploit, as referenced in CVE-2016-5195. The system under investigation was susceptible to this exploit.
In simple terms, this specific C code once compiled and executed will back up the passwd user accounts file located in /etc/, modify the file as required for the root user account, populate a new password for the root user in the shadow file located in /etc/ and finally restore the newly created passwd.
As a result, this permits root user access for the threat actor. During the course of the investigation, it was determined that the threat actor had successfully uploaded several malicious PHP web shells to the web server via known arbitrary file upload vulnerabilities, to allow for remote code execution.
Following a review of the Apache log files, it was possible to identify the upload of the PHP web shells and, subsequently, the actions associated with Dirty COW. In simple terms, the log file depicted the following for Dirty COW:
- encrypted payload being uploaded
- C code being compiled
- binary being executed
- root access being gained
While reviewing log files was not in scope for this post, it has to be said that they play a crucial role and, once coupled with the results of memory analysis, make it possible to paint a full picture and provide a timeline of the malicious activity. It didn't take much to pull some very useful artefacts from memory and, as seen, a lot of the useful information identified may not reside on disk, which is why it's of vital importance to obtain memory dumps for analysis on any suspected 'patient 0' endpoints.
It can be very useful to compare the different methodologies used by threat actors during the analysis phase and, as depicted below, it is possible in this scenario to see how each of the Cyber Kill Chain phases was achieved.
|Kill chain phase|Activity|
|Delivery|Upload webshell and dirty.c|
|Exploitation|Remote code execution|
|Installation|Linux kernel module rootkit|
|Command & Control|SSH remote access|
|Actions on Objectives|Goals of attack; exfiltration of data|
It should be said that the threat actor in this instance accomplished many of the tactics, techniques and procedures (TTPs) depicted in the MITRE ATT&CK framework.