Hexacorn

Observing malware is one thing. Observing the very same malware in a rich context is another.

The traditional approach to sandboxes focuses on scoring the sample’s badness and extracting IOCs, and not so much on in-depth analysis. It’s understandable, because in-depth analysis is not the ultimate goal. Still… being able to extract more information that may help with manual analysis is always welcome. And it’s actually getting better – the competition is slowly changing the landscape, and newer sandboxes support memory dumping and PE file rebuilding, show nice process / thread trees and various graphs, etc… and put more and more hooks in place. And then again, even if they intercept the most popular APIs, inline functions, or even virtual tables, it may still not be enough.

I thought, what would happen if I intercepted not only the most popular APIs that are used by malware, but also those that are less frequently looked at, and in particular those that may help to understand the flow of events in a better context – enriching the data that the sandbox presents and making in-depth analysis easier.

What are these APIs?

Let me show you an example…

Imagine you intercept the function CreateToolhelp32Snapshot to take note of the fact that the malware is enumerating processes. This may add to the ‘badness’ weight, but on its own is not a malicious feature per se. Lots of ‘clean’ programs enumerate processes.

What if we not only did that, but also intercepted Process32First and Process32Next?

This could be the result (output is simplified to demo the idea):

CreateToolhelp32Snapshot
Process32First: [System Process]
Process32Next: System
Process32Next: smss.exe
Process32Next: csrss.exe
Process32Next: winlogon.exe
Process32Next: services.exe
Process32Next: lsass.exe
Process32Next: svchost.exe
Process32Next: svchost.exe
Process32Next: svchost.exe
Process32Next: svchost.exe
Process32Next: svchost.exe
Process32Next: spoolsv.exe
Process32Next: explorer.exe
Opens Process: %WINDOWS%\explorer.exe
VirtualAllocEx: %WINDOWS%\explorer.exe
NtWriteVirtualMemory: %WINDOWS%\explorer.exe
VirtualAllocEx: %WINDOWS%\explorer.exe
NtWriteVirtualMemory: %WINDOWS%\explorer.exe
VirtualAllocEx: %WINDOWS%\explorer.exe
NtWriteVirtualMemory: %WINDOWS%\explorer.exe
VirtualAllocEx: %WINDOWS%\explorer.exe
NtWriteVirtualMemory: %WINDOWS%\explorer.exe
VirtualAllocEx: %WINDOWS%\explorer.exe
NtWriteVirtualMemory: %WINDOWS%\explorer.exe
CreateRemoteThread: %WINDOWS%\explorer.exe
NtResumeThread: %WINDOWS%\explorer.exe

Analysing a log like this tells you straight away that the malware is enumerating processes, and when it finds explorer.exe, it injects a bunch of buffers into it (possibly mapping sections of the PE payload?), and then creates a remote thread. As a result, the explorer.exe process is now hosting a malicious payload.
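To make that reading of the log a bit more mechanical, here is a minimal Python sketch that summarises a log in the simplified format shown above. It is only an illustration: the function name summarize_injection, the INJECT_APIS tuple and the shortened SAMPLE_LOG are my own assumptions for the demo, and a real sandbox report would need a proper parser rather than a couple of regular expressions.

```python
import re

# Matches the simplified "Process32First: name" / "Process32Next: name" lines.
ENUM_RE = re.compile(r"^Process32(?:First|Next):\s*(.+)$")

# APIs counted as injection-related in this sketch (an assumption, not a full list).
INJECT_APIS = ("VirtualAllocEx", "NtWriteVirtualMemory", "CreateRemoteThread")


def summarize_injection(log_lines):
    """Summarise enumeration and injection activity in a simplified API log."""
    enumerated = []   # process names in the order the sample saw them
    opened = None     # target of the last "Opens Process" line, if any
    calls = {}        # injection-related API name -> number of calls
    for raw in log_lines:
        line = raw.strip()
        match = ENUM_RE.match(line)
        if match:
            enumerated.append(match.group(1))
            continue
        if line.startswith("Opens Process:"):
            opened = line.split(":", 1)[1].strip()
            continue
        api, sep, _target = line.partition(":")
        if sep and api in INJECT_APIS:
            calls[api] = calls.get(api, 0) + 1
    return {
        "enumerated": enumerated,
        # The name at which the enumeration stopped is the likely target.
        "stopped_at": enumerated[-1] if enumerated else None,
        "opened": opened,
        "calls": calls,
    }


# Shortened, hypothetical excerpt in the same format as the log above.
SAMPLE_LOG = [
    "CreateToolhelp32Snapshot",
    "Process32First: [System Process]",
    "Process32Next: System",
    "Process32Next: spoolsv.exe",
    "Process32Next: explorer.exe",
    r"Opens Process: %WINDOWS%\explorer.exe",
    r"VirtualAllocEx: %WINDOWS%\explorer.exe",
    r"NtWriteVirtualMemory: %WINDOWS%\explorer.exe",
    r"VirtualAllocEx: %WINDOWS%\explorer.exe",
    r"NtWriteVirtualMemory: %WINDOWS%\explorer.exe",
    r"CreateRemoteThread: %WINDOWS%\explorer.exe",
    r"NtResumeThread: %WINDOWS%\explorer.exe",
]

summary = summarize_injection(SAMPLE_LOG)
print(summary["stopped_at"], summary["opened"], summary["calls"])
```

The interesting field is stopped_at: if the enumeration stops at the very process that is opened and written to right after, the sample was most likely hunting for that process rather than picking one at random.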

While the code injection into explorer.exe can be deduced from manual dynamic analysis, or may even be obvious when we evaluate the process tree and network connections in a report generated by a sandbox, there is a subtle difference. The context these 2 additional intercepted APIs provide allows us to be quite certain that the malware is specifically looking for explorer.exe, and not for some other process.

It also tells us HOW the process is found.

And mind you, this is actually not a trivial question if you are doing in-depth malware analysis.

There are cases where this determination is very important. Being able to quickly determine whether we are missing some target process on the test system can save us a lot of time spent on mundane manual analysis. This is actually one of the first questions your customer will ask you, especially when it comes to targeted attacks. It is a very responsible job to deliver the results and not to miss stuff!

When you look at malware that is highly targeted, e.g. malware that is targeting Point of Sale systems, running it through a sandbox may _not_ give you any good results, because you either won’t see the process enumeration at all, or may miss the name of the process that the malware is looking for. The malware will look ‘broken’ to us. I can’t count how many times I wasted time on manual analysis and even incorrectly concluded that the malware was ‘broken’ while looking at heavily obfuscated, or bloatwarish malware samples. Until I started looking at the context of the early exit.

It is really helpful to be able to cheat a bit.

For the case of the process enumeration one can not only intercept the Process32First and Process32Next functions, but also enhance the results with the interception of string comparison functions.

If we get lucky, the result could look like this:

Process32First: [System Process]
lstrcmpiA ([System Process], explorer.exe)
Process32Next: System
lstrcmpiA (System, explorer.exe)
Process32Next: smss.exe
lstrcmpiA (smss.exe, explorer.exe)
Process32Next: csrss.exe
lstrcmpiA (csrss.exe, explorer.exe)
Process32Next: winlogon.exe
lstrcmpiA (winlogon.exe, explorer.exe)
Process32Next: services.exe
lstrcmpiA (services.exe, explorer.exe)
Process32Next: lsass.exe
lstrcmpiA (lsass.exe, explorer.exe)
Process32Next: vmacthlp.exe
lstrcmpiA (vmacthlp.exe, explorer.exe)
Process32Next: svchost.exe
lstrcmpiA (svchost.exe, explorer.exe)
Process32Next: svchost.exe
lstrcmpiA (svchost.exe, explorer.exe)
Process32Next: svchost.exe
lstrcmpiA (svchost.exe, explorer.exe)
Process32Next: svchost.exe
lstrcmpiA (svchost.exe, explorer.exe)
Process32Next: svchost.exe
lstrcmpiA (svchost.exe, explorer.exe)
Process32Next: spoolsv.exe
lstrcmpiA (spoolsv.exe, explorer.exe)
Process32Next: PERSFW.exe
lstrcmpiA (PERSFW.exe, explorer.exe)
Process32Next: explorer.exe
lstrcmpiA (explorer.exe, explorer.exe)

That makes in-depth malware analysis super easy, doesn’t it?
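A log like the one above can also answer the "are we missing the target process on the test box?" question automatically. The Python sketch below pairs each lstrcmpiA line with the most recently enumerated process name and treats the other argument as the name the sample is hunting for. Everything here is an assumption made for illustration: the function name wanted_process_names, the simplified log format, and in particular the process name postarget.exe, which is made up and does not come from any real sample.

```python
import re

# Simplified log line formats, as in the output shown above.
ENUM_RE = re.compile(r"^Process32(?:First|Next):\s*(.+)$")
CMP_RE = re.compile(r"^lstrcmpiA\s*\((.+), (.+)\)$")


def wanted_process_names(log_lines):
    """Map each name the sample compared against to True if it was found."""
    current = None   # most recently enumerated process name (lowercased)
    wanted = {}      # wanted name -> True once a comparison matched
    for raw in log_lines:
        line = raw.strip()
        match = ENUM_RE.match(line)
        if match:
            current = match.group(1).lower()
            continue
        match = CMP_RE.match(line)
        if not match or current is None:
            continue
        # lstrcmpi ignores case, so compare lowercased arguments.
        args = [match.group(1).lower(), match.group(2).lower()]
        others = [arg for arg in args if arg != current]
        if len(others) == 2:
            continue  # comparison unrelated to the enumeration
        # No differing argument means both equal the current name: a hit.
        target = others[0] if others else current
        wanted[target] = wanted.get(target, False) or not others
    return wanted


# Excerpt of the log above: the target is present on the test system.
FOUND_LOG = [
    "Process32First: [System Process]",
    "lstrcmpiA ([System Process], explorer.exe)",
    "Process32Next: System",
    "lstrcmpiA (System, explorer.exe)",
    "Process32Next: explorer.exe",
    "lstrcmpiA (explorer.exe, explorer.exe)",
]

# Hypothetical log: the sample wants a process that is not running.
MISSING_LOG = [
    "Process32First: [System Process]",
    "lstrcmpiA ([System Process], postarget.exe)",
    "Process32Next: System",
    "lstrcmpiA (System, postarget.exe)",
    "Process32Next: explorer.exe",
    "lstrcmpiA (explorer.exe, postarget.exe)",
]

for name, found in wanted_process_names(MISSING_LOG).items():
    if not found:
        print("sample looked for", name, "but it is not on the test system")
```

Any name that comes back as False is a candidate explanation for a sample that exits early and looks ‘broken’: the process it wants is simply not running on the analysis box.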

I think there is a potential market for supporting in-depth malware analysis with sandbox technology – make the interception configurable (offer a list of APIs to monitor, allow the run time to be selected manually, rebuild files, perhaps give live access to the analysis box, etc.).

Reversing ykS is the limit.

And while I do commercial in-depth analysis and I may be shooting myself in the foot here, I can’t stress enough how important ROI is for both you and the customer.

Enter Sandbox – part 15: rE[mn]u[mn]eration games