Enter Sandbox – part 3: If you see Native code is creative

Native functions are a very tempting target for monitoring, as they implement the basic, atomic operations used by the OS. Observing them can give us a lot of juicy information about what is going on on the system, and the chances of evasion are low since most of the complex, high-level functions typically end up calling these OS ‘primitives’ anyway. Hooking/monitoring can be done both at the kernel level (system-wide) and in user mode (process-wide); more exotic hooks can go deeper and monitor sysenter/syscall via Model Specific Registers (MSR), patch dispatcher functions, e.g. KiFastCallEntry, etc. – anything that participates in the transition between kernel and user mode can be monitored/patched/intercepted. There is also an extra layer to consider: the WoW64 link between the 32-bit and 64-bit layers inside 32-bit processes running on a 64-bit OS.

As I mentioned in the first part, I am not a big fan of hooking native functions. More specifically, I do think it’s worth hooking these functions, but it is not necessary to always do so, and there is no need to output their logs into the report all the time. The thing is that they are extremely noisy and they really lack context. For the record, I must emphasize here again that I am mainly focusing on manual analysis – a commercial sandbox should definitely look at everything, and it is better for it to be oversensitive and show more than to ignore some important stuff.

If we look back at the list of APIs that I presented in part 1, the ones resolved using GetProcAddress, you will notice that all of them are actually non-native APIs. They are just regular Windows APIs.

There is a simple reason for it.

Most malware is written in high-level languages and leverages frameworks with predefined, well-structured and easy-to-use libraries. The functions malware writers rely on are imported statically, and since the authors often use copy&pasted code, the result is that similar stuff ends up in gazillions of malicious projects written in Delphi, VB, AutoIT, .NET, etc. While some malware families do leverage native functions, it is not that common. It is important to say that native functions are not that difficult to use – they are just not that convenient. Who would want to bother with data alignment, undocumented structures, or NtCreateFile if there is an easier way, e.g. fopen, CreateFile, etc.?

For a reverser, seeing native functions being used by malware is typically good news, i.e. it means interesting work; it means that someone on the other side of the fence at least made an effort to be creative and most likely wrote the code either in asm or in C. Copy & paste exists here too, of course (including code ported to very high-level languages), but I’d argue that it happens on a much smaller scale and is mainly used by wrappers.

The code of malware families that leverage native functions is interesting not only for its functionality or technical craftsmanship, but also for the simple reason that, being the creation of its intelligent authors, it is probably the most personal code you will see in the software world (in contrast to the copy&paste efforts that are present all over the place, e.g. inside POS malware, but also in regular software, which nowadays relies heavily on copy&paste from Stack Overflow). I could summarize this paragraph by saying that creative malware writers typically use native functions and, from a different angle, if you are looking for interesting malware, look for the one that is using native functions.

The post wouldn’t be complete if we didn’t list some of the most popular ntdll functions that are resolved during run-time (i.e. via GetProcAddress):

LdrFindEntryForAddress
NtUnmapViewOfSection
NtQuerySystemInformation
ZwQueryInformationThread
RtlInitUnicodeString
ZwUnmapViewOfSection
RtlNtStatusToDosError
NtMapViewOfSection
NtOpenSection
NtQueryInformationProcess
RtlAllocateHeap
RtlDecompressBuffer
RtlFreeHeap
RtlEnterCriticalSection
RtlLeaveCriticalSection
RtlGetLastWin32Error
RtlReAllocateHeap
RtlDeleteCriticalSection
LdrLoadDll
CsrGetProcessId
LdrGetDllHandle
RtlSetLastWin32Error
NtSetInformationProcess
RtlAddVectoredExceptionHandler
RtlImageNtHeader
LdrGetProcedureAddress
NtCreateThread
RtlAdjustPrivilege
RtlUnwind
NtSetInformationThread
memset
VerSetConditionMask
RtlRemoveVectoredExceptionHandler
memcpy
ZwClose
ZwQueryInformationProcess
NtCreateUserProcess
NtAllocateVirtualMemory
RtlUserThreadStart
ZwOpenProcess
NtWriteVirtualMemory
ZwQuerySystemInformation
NtClose
NtReadVirtualMemory
RtlImageDirectoryEntryToData
NtDelayExecution
swprintf
RtlSizeHeap
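
To make the run-time resolution concrete, here is a minimal C sketch of the pattern: find ntdll.dll in the current process and ask GetProcAddress for an export by name. The helper name resolve_ntdll and the generic_fn type are made up for this illustration; GetModuleHandleA and GetProcAddress are the real Win32 calls. The non-Windows branch is there only so the sketch compiles elsewhere, where it simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Generic function pointer type (hypothetical name); the caller must
   cast it to the real prototype of the resolved function before use. */
typedef void (*generic_fn)(void);

/* resolve_ntdll is a hypothetical helper, not a Windows API.
   Returns the address of an ntdll export, or NULL if it is not found. */
generic_fn resolve_ntdll(const char *name)
{
    assert(name != NULL);
#ifdef _WIN32
    /* ntdll.dll is already mapped into every Win32 process,
       so there is no need to call LoadLibrary here. */
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");
    if (ntdll == NULL)
        return NULL;
    return (generic_fn)GetProcAddress(ntdll, name);
#else
    /* Assumption: on non-Windows builds there is nothing to resolve. */
    return NULL;
#endif
}
```

On Windows, a call such as resolve_ntdll("NtQueryInformationProcess") should return a pointer that still has to be cast to the right prototype, with all the undocumented structures that come with it, before it can be called.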

Enter Sandbox – part 2: COM, babe COM

API hooking, or interception, described in part 1 is great for many analyses and works very well for many older, generic samples, but to be able to handle modern samples a sandbox needs to handle the Component Object Model (COM) as well. COM is a bitch when it comes to analysis and hooking, because it’s omnipresent, not everything is properly documented, there are lots of ways to do the same thing and, funnily enough, developers using COM make lots of mistakes and often reference pointers incorrectly. While their apps crash internally and the exceptions are handled by the respective frameworks, any intrusive sandbox will typically crash the application if it is not prepared to handle the programmers’ mistakes.

When I say that the same thing can be done in many ways, there is a simple reason for it. While COM objects are typically instantiated using e.g. CoCreateInstance, CoCreateInstanceEx, CoGetClassObject, or by actually calling some COM methods, there is also a myriad of ‘regular’ APIs that can instantiate COM objects – a simple example is PStoreCreateInstance.

COM is quite a mess and the deeper you dig the more weird stuff you will find (f.ex. interfaces changing names over time messing up your collection of CLSIDs).

Good luck handling it all…

Hooking COM objects requires manipulating virtual tables: either the original ones that are hidden inside the code/data of the COM object provider, or doing it dynamically, only inside the buffers allocated for instantiated objects. Either way, it is sometimes not welcomed by the hooked applications, which may implement code to prevent COM hooking (I have seen this). Non-invasive interception is possible as well, but it requires a good tracking mechanism – some samples can call COM many times during the analysis session.
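
The dynamic, per-instance variant can be sketched in portable C. This is not real COM: IDemo, its single SetPath method and all helper names are made up for illustration, and the only thing borrowed from COM is the layout, i.e. an object whose first field points to a table of function pointers. The hook copies the table, swaps one entry for a logging wrapper, and repoints only the chosen instance at the copy, so the provider's original table stays untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical COM-like interface: the first field of the object
   points to a table of function pointers (the virtual table). */
typedef struct IDemo IDemo;
typedef struct IDemoVtbl {
    int (*SetPath)(IDemo *self, const char *path);
} IDemoVtbl;
struct IDemo {
    const IDemoVtbl *lpVtbl;
    char path[260];
};

/* The "provider" side: original method and original table. */
static int Demo_SetPath(IDemo *self, const char *path)
{
    snprintf(self->path, sizeof self->path, "%s", path);
    return 0;
}
static const IDemoVtbl g_original_vtbl = { Demo_SetPath };

void demo_init(IDemo *obj)
{
    obj->lpVtbl = &g_original_vtbl;
    obj->path[0] = '\0';
}

/* The "sandbox" side. Simplification: one saved table and one patched
   copy, so this sketch can hook a single instance at a time. */
static const IDemoVtbl *g_saved_vtbl;
static IDemoVtbl g_hooked_vtbl;
static char g_last_logged[260];
static int g_hook_calls;

static int Hook_SetPath(IDemo *self, const char *path)
{
    /* Log the argument, then forward to the original method. */
    g_hook_calls++;
    snprintf(g_last_logged, sizeof g_last_logged, "%s", path);
    return g_saved_vtbl->SetPath(self, path);
}

void hook_instance(IDemo *obj)
{
    assert(obj != NULL && obj->lpVtbl != NULL);
    g_saved_vtbl = obj->lpVtbl;        /* remember the original table */
    g_hooked_vtbl = *obj->lpVtbl;      /* copy every entry            */
    g_hooked_vtbl.SetPath = Hook_SetPath;
    obj->lpVtbl = &g_hooked_vtbl;      /* repoint this instance only  */
}
```

After demo_init(&obj) and hook_instance(&obj), a call made through obj.lpVtbl->SetPath(&obj, ...) goes via Hook_SetPath first, while any other instance keeps using the original table. Patching the provider's table in place would instead affect every instance at once, which is the trade-off between the two approaches.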

If you have read this far, you may be wondering which COM objects we could hook and why it really matters.

Nowadays many malicious apps use various evasions, and lots of them are implemented using COM. A simple example is IBackgroundCopyJob, used by FinFisher in an attempt to copy files under the noses of sandboxes/AV. COM is also used to create/modify shortcuts and to download stuff in the background using the Background Intelligent Transfer Service (BITS) and other interfaces – and you may _not_ get to see the URLs/domains contacted if you only rely on API hooking. Last but not least, popular evasions rely on enumerating various properties using WMI, and these are also handled via COM.

Not hooking this stuff leaves a lot of unanswered questions and limits the actionable data that can be extracted from the session.

This is an example of COM hooking in action:

  • Using ShellLink to create a shortcut file
    • CoCreateInstanceEx (ShellLink, IShellLinkA)
    • IShellLinkA::SetPath (%SYSTEM%\malware.exe)
    • IPersistFile::Save (C:\Documents and Settings\user\Start Menu\Programs\malware.lnk)
  • Using web browser object to download stuff
    • IWebBrowser2::Navigate (URL=http://xx.xx.xx.xx/media/1,Flags=,TargetFrameName=,PostData=,Headers=)
  • Using WMI to enumerate processes
    • IWbemLocator::ConnectServer (strNetworkResource=root\cimv2, user=, password=, locale=)
    • IWbemServices::ExecQuery (strQueryLanguage=WQL, Query=SELECT __PATH, ProcessId, CSName, Caption, SessionId, ThreadCount, WorkingSetSize, KernelModeTime, UserModeTime, ParentProcessId FROM Win32_Process)