Enter Sandbox – part 16: The symbols, the ApiSetSchema, and other possible future evasions

It’s been a while since I wrote about Sandboxes and I thought I will revive the series by listing a couple of ideas that I believe may be still under the radar of both sandbox solution creators, and reverse engineers.

Symbols

When we analyze Windows native OS libraries one of the most useful features we have at hand is leveraging debugging symbols. IDA, windbg, and many other reversing tools can use these symbols very efficiently – with symbols in place we can see names of internal functions, variables, and this code/data enrichment provides us with an invaluable context that speeds up the analysis.

Now, since the tools can use it, there is nothing that could stop malware from… doing the same.

Imagine the possibilities!

Instead of relying on export/import tables, a clever malware that leverages symbols could make calls to internal functions, or find ways to hook code deep inside the functions that are typically monitored (at least on the userland level). Symbols could also help to detect caves i.e. areas of code/data that are rarely used, and allow malware to overwrite them and persist in a much stealthier way than usual. There is a lot of potential for surgical modification of code to launch the payload code using the EPO (Entry Point Obscuring) technique.

Note that while there could be a need to download these PDB files directly on the infected system, nothing stops malware from sending copies of the DLLs from the system to its server first, decompiling / disassembling them on a remote server.  The malware could then craft the payload using hardcoded offsets obtained via symbols to deliver the required functionality. A side effect of such trickery would be that such crafted payloads would not run on sandboxes, and manual analysis would typically fail unless the file was analyzed on the exactly same system (meaning: with the same versions of libraries as existing on the victim’s system); obviously, ASLR needs to be taken into account for all offsets and calls. Another caveat is that such approach would require constant monitoring of Windows Update service; if files are replaced, the code would need to be updated to the most up to date version that works with the new libraries.

Leveraging functions of ApiSetSchema libraries

Now that we have not only kernel32.dll, but also KernelBase.dll, and the whole api-ms-*.dll zoo, it is possible to call these wrapper functions instead of the pure exports from kernel32.dll, advapi32.dll, etc..

Leveraging internal/undocumented functions

These can be either located via symbols, or via ApiSetSchema libraries; a quick browsing through internal functions referenced by kernel32.dll reveals a lot of interesting possibilities e.g.:

  • PrivCopyFileExW – undocumented function that allows to copy files
  • A couple of exported Internal functions that can be potentially used as callbacks (to run shellcodes, payload code, etc.)
    • Internal_EnumCalendarInfo
    • Internal_EnumDateFormats
    • Internal_EnumLanguageGroupLocales
    • Internal_EnumSystemCodePages
    • Internal_EnumSystemLanguageGroups
    • Internal_EnumSystemLocales
    • Internal_EnumTimeFormats
    • Internal_EnumUILanguages
  • A couple of exported Internal functions that can be used to access both the Registry and the File System bypassing documented APIs:
    • RegCreateKeyExInternalA
    • RegCreateKeyExInternalW
    • RegDeleteKeyExInternalA
    • RegDeleteKeyExInternalW
    • RegOpenKeyExInternalA
    • RegOpenKeyExInternalW
    • ReplaceFileExInternal

There is certainly more.

New technologies (well, sometimes not that new)

While it doesn’t really rely on symbols, it does fit the topic of the article – there is a class of APIs that are not commonly used yet, but certainly will be heavily utilized in the future:  I am talking about enclave functions; as per MS:

An enclave is an isolated region of code and data within the address space for an application. Only code that runs within the enclave can access data within the same enclave.

Example functions:

So, you got yourself yet another API set for both code injection and protection.

I am obviously not the first one to highlight it; a paper from 2015 (3 years ago!) by Alex Ionescu covers the topic and lists a number of possible issues enclaves bring to the world of security solutions including AV, EDR, and perhaps memory acquisition tools.

Turning the MAX_PATH length to 11

In my recent post I described how NTFS Sparse files could be used to make life of reversers, forensic investigators, and sandbox developers a bit more difficult. There are  more way to to make their life more miserable and here I describe another potential one.

Every once in a while I re-read the Naming Files, Paths, and Namespaces article on Microsoft page. I do it because I don’t remember all the gory details and I know that this page is updated every once in a while so sometimes you can find some new content there.

New content == new ideas.

While the change I refer to is not very very new, it is very interesting especially from the reversers’ and sandbox developers’ perspective:

Maximum Path Length Limitation

In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters.

[…]

Tip Starting in Windows 10, version 1607, MAX_PATH limitations have been removed from common Win32 file and directory functions. However, you must opt-in to the new behavior.

A registry key allows you to enable or disable the new long path behavior. To enable long path behavior set the registry key at HKLM\SYSTEM\CurrentControlSet\Control\FileSystem LongPathsEnabled (Type: REG_DWORD). The key’s value will be cached by the system (per process) after the first call to an affected Win32 file or directory function (list follows). The registry key will not be reloaded during the lifetime of the process. In order for all apps on the system to recognize the value of the key, a reboot might be required because some processes may have started before the key was set.

The registry key can also be controlled via Group Policy at Computer Configuration > Administrative Templates > System > Filesystem > Enable NTFS long paths.

This is very interesting.

If there is no limitation, malware could abuse it relying on the fact limitations apply to all software that is not aware of this change.

For instance, if sandbox is copying files from the sandbox to external world, using a script/program that runs inside the sandbox, and does not use the extended-length path’s prefix \\?\ when it accesses files it will fail to copy files out as it won’t ‘see’ them (they are too deep). Notably, the very same limitation applies to the old trick where the known device name is used as a file name and such file (e.g. c:\con) can only be accessed via that extended-length path’s prefix.

But are all sandbox developers aware of this? (unless they use external access to the sandbox disk using e.g. their own NTFS parser)

And as for reversers… if the tool they use to browse the system, open the file don’t use the extended-length path’s prefix they will find themselves unable to browse/open files inside these deeply nested folders.

And if you are a forensic investigator, there is yet another Registry key to learn about:

HKLM\SYSTEM\CurrentControlSet\Control\FileSystem
LongPathsEnabled

So, not entirely new (the c:\con trick is at least 10 years old), but perhaps new to you.