Enter Sandbox part 21: Intercepting Buffers #2, Abusing the freedom of buffers, after their release

December 22, 2018 in Batch Analysis, Sandboxing

In my last post I mentioned the magic word ‘buffer’. If you follow the series, you now know that strings are great buffers to look at, and… that ‘there is more’.

There is indeed.

If Steve Ballmer had ever been the CEO of a sandbox company, I bet his famous ‘Developers’ video would now be known as ‘Buffers’.

Apart from monitoring string functions, the most successful results in my dynamic malware analysis came from monitoring selected memory-oriented functions. And no, no, not the ones that merely allocate the memory, but the ones that actually release the previously allocated memory blocks, or change their protection rights.

You see, most malware using in-memory payloads or encrypted configs follows a very well-established pattern:

  • allocate memory
  • unpack/decompress payload/config to it
  • use some runPE module to resolve the imported APIs (if code)
  • transfer execution to the new entry point (if code)
  • in some cases temporary buffers are used, and they are often freed after use

This pattern is highly prevalent.
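
To make the pattern concrete, here is a tiny, benign illustration I made up (the ‘payload’ is a single ret instruction and the ‘unpacking’ is a plain copy):

    #include <windows.h>
    #include <string.h>

    /* A benign stand-in for an unpacked payload: a single 'ret' opcode. */
    static const unsigned char g_payload[] = { 0xC3 };

    int main(void)
    {
        /* 1. allocate memory (writable first) */
        void *buf = VirtualAlloc(NULL, sizeof(g_payload),
                                 MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (!buf) return 1;

        /* 2. 'unpack' the payload into it (real malware would decrypt/decompress here) */
        memcpy(buf, g_payload, sizeof(g_payload));

        /* 3. make it executable - this is the moment a monitor hooking
              VirtualProtect sees the fully decoded payload */
        DWORD old;
        if (!VirtualProtect(buf, sizeof(g_payload), PAGE_EXECUTE_READ, &old)) return 1;

        /* 4. transfer execution to the new 'entry point' */
        ((void (*)(void))buf)();

        /* 5. temporary buffer freed after use - another interception point */
        VirtualFree(buf, 0, MEM_RELEASE);
        return 0;
    }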

This pattern almost begs for us to start monitoring the VirtualFree and VirtualProtect functions.

Why?

When VirtualAlloc returns, the block is allocated, but there is nothing in it yet. By the time VirtualProtect or VirtualFree is called, the actual juicy code/data is already there. Most of the time.

As a side note: obviously, you can, and should, expand the described monitoring coverage to the native NT functions as well (NtFreeVirtualMemory, etc.).
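
As a minimal sketch – assuming your monitor already has a way to reroute the API (e.g. a Detours-style trampoline) and some hypothetical DumpBuffer() helper that writes blobs to disk – a VirtualProtect interceptor could look roughly like this:

    #include <windows.h>
    #include <stdio.h>

    /* Pointer to the original API; a Detours-style engine would initialise it
       to VirtualProtect and then reroute it through a trampoline. */
    static BOOL (WINAPI *Real_VirtualProtect)(LPVOID, SIZE_T, DWORD, PDWORD) = VirtualProtect;

    /* Hypothetical helper: write the buffer to the sandbox's dump folder. */
    static void DumpBuffer(const char *tag, const void *addr, SIZE_T size)
    {
        char path[MAX_PATH];
        snprintf(path, sizeof(path), "C:\\dumps\\%s_%p_%zu.bin", tag, addr, (size_t)size);
        HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, 0, NULL);
        if (h != INVALID_HANDLE_VALUE) {
            DWORD written;
            WriteFile(h, addr, (DWORD)size, &written, NULL);
            CloseHandle(h);
        }
    }

    /* Replacement VirtualProtect: by the time it is called, the region is
       typically already filled with the decoded payload - dump it first. */
    BOOL WINAPI Hook_VirtualProtect(LPVOID addr, SIZE_T size, DWORD prot, PDWORD oldProt)
    {
        /* pages that become executable are the most interesting ones */
        if (prot & (PAGE_EXECUTE | PAGE_EXECUTE_READ |
                    PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY))
            DumpBuffer("vprotect", addr, size);

        return Real_VirtualProtect(addr, size, prot, oldProt);
    }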

You may not believe it, but before malware authors became malware programmers, they were just… well… programmers. And I don’t know any C programmer who wasn’t taught these two fundamental principles of memory management during their C training:

  • if you allocate some memory, you must free it when no longer needed
  • if you change permissions, there must be a reason for it, i.e. the memory block has already been filled in with something

The second one I actually made up, but it not only fits my narrative – it is aligned with the way most malware is written. Allocate a buffer. Copy stuff to it. Change permissions. Execute it/interpret it/use it (e.g. a decrypted config). Or move it somewhere else.

In my experience there are not that many malware programmers who can break away from these two code patterns. Obviously, this only applies to old-school programming languages where it actually matters how you manage memory. And to nicely written shellcodes.

Okay. So… the moment the buffers are freed, or their permissions are changed, we can jump in, dump them, and harvest the juicy code/data.

There is even more good news.

If the malware author doesn’t free the buffers and doesn’t change permissions, there is still hope. At the time the memory is allocated, you can set up an event that will later trigger the monitor to dump the contents of the allocated buffer.

For example, you can dump the block when:

  • the instruction pointer is within a previously allocated block (i.e. the code is being executed from a dynamically allocated buffer! – see the sketch after this list)
  • a certain amount of time has passed since the allocation
  • the checksum of the memory block changes
  • a certain number of memory writes to the region occurred
  • a number of APIs were called after the allocation
  • a network connection was initiated (i.e. the payload is ‘working’)
  • the program terminates or crashes
  • etc. etc.
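
Here is a rough sketch of the first trigger from the list – the instruction pointer landing inside a previously allocated block. It assumes the VirtualAlloc hook records every region and that every other API hook passes in its caller’s return address; the function names are made up:

    #include <windows.h>
    #include <intrin.h>
    #pragma intrinsic(_ReturnAddress)

    #define MAX_REGIONS 1024

    typedef struct { char *base; SIZE_T size; } REGION;
    static REGION g_regions[MAX_REGIONS];
    static int    g_count;

    void DumpBuffer(const char *tag, const void *addr, SIZE_T size); /* hypothetical helper */

    /* Called from the VirtualAlloc hook: remember every dynamic allocation. */
    void TrackRegion(void *base, SIZE_T size)
    {
        if (g_count < MAX_REGIONS) {
            g_regions[g_count].base = (char *)base;
            g_regions[g_count].size = size;
            g_count++;
        }
    }

    /* Called from every API hook with that hook's _ReturnAddress(): if the code
       that called the API lives inside a tracked region, the payload is already
       executing from a dynamically allocated buffer - dump it. */
    void CheckCaller(void *caller)
    {
        for (int i = 0; i < g_count; i++) {
            if ((char *)caller >= g_regions[i].base &&
                (char *)caller <  g_regions[i].base + g_regions[i].size) {
                DumpBuffer("exec_region", g_regions[i].base, g_regions[i].size);
                break;
            }
        }
    }

    /* inside each API hook:  CheckCaller(_ReturnAddress()); */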

And there is even more good news.

Many script/code obfuscators that rely on hiding the code with security-by-obscurity tricks are written in high-level programming languages. These make extensive use of heap functions when they deal with memory blocks.

A-ha.

While Virtual memory functions are cool to monitor, what about heap functions?

Bingo.

In most cases you can access the hidden code/data processed by these ‘obfuscators’ almost instantly. Just wait for the functions that release memory back to the heap to be called, and dump the content of these allocated, but no longer needed, memory blocks.

Also, if you are wondering about monitoring the GlobalAlloc & LocalAlloc APIs and their respective GlobalFree and LocalFree releasing functions – these days they are just wrappers around the heap functions. You can of course monitor them separately too (it may help with malware family fingerprinting).
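
A minimal sketch of such a heap interceptor, again assuming a Detours-style hook and the hypothetical DumpBuffer() helper (proper buffer sizing is discussed in more detail towards the end of this post):

    #include <windows.h>

    static BOOL (WINAPI *Real_HeapFree)(HANDLE, DWORD, LPVOID) = HeapFree;

    void DumpBuffer(const char *tag, const void *addr, SIZE_T size); /* hypothetical helper */

    /* Replacement HeapFree: the block still holds its 'no longer needed' data
       when the release function is entered - dump it before the real free runs. */
    BOOL WINAPI Hook_HeapFree(HANDLE heap, DWORD flags, LPVOID mem)
    {
        if (mem) {
            /* HeapSize (a wrapper around ntdll!RtlSizeHeap) returns the exact
               size of the block that is about to be released */
            SIZE_T size = HeapSize(heap, flags, mem);
            if (size != (SIZE_T)-1)
                DumpBuffer("heapfree", mem, size);
        }
        return Real_HeapFree(heap, flags, mem);
    }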

We have mentioned virtual memory functions (served by both Win32 and NT APIs) and heap functions – what about the stack?

Yes, by all means.

This is yet another great source of intel. If you use a debugger on a regular basis, you know that the stack is a great source of information. If you can build a tree of calls that led to e.g. a crash, you have a lot of information to investigate and troubleshoot the issue.

And it can be extended to data buffer inspection, e.g. looking at the local variables of the calling code.

Any time you intercept an API call, you can inspect the stack buffers and see what interesting information can be found there. Again, very often it will be strings of any sort (including those that were built on the stack by more obfuscated code), pointers to strings, pointers to pointers to strings, offsets to structures in memory, sometimes function callbacks, etc. All of it can support manual analysis a lot.

Listing hexadecimal values of what is currently on the stack (and of the buffers some stack values point to) before and after the API is called is really useful – e.g. 10-20 dwords/qwords beyond the actual API arguments, which can of course be interpreted easily, because we know what arguments are passed to the APIs.
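
Here is a rough sketch of such a stack peek. It assumes an MSVC build, the _AddressOfReturnAddress() intrinsic, and that each API hook passes in its own stack location; it is purely illustrative:

    #include <windows.h>
    #include <intrin.h>
    #include <stdio.h>
    #include <string.h>
    #pragma intrinsic(_AddressOfReturnAddress)

    /* Called from inside an API hook, which passes its own _AddressOfReturnAddress().
       Prints the caller's return address plus the next 'count' pointer-sized stack
       slots above it; values that look like readable pointers get a short ASCII
       preview. Purely a sketch. */
    void PeekStack(const char *api_name, ULONG_PTR *slot, int count)
    {
        printf("[%s] return address: %p\n", api_name, (void *)slot[0]);
        for (int i = 1; i <= count; i++) {
            ULONG_PTR value = slot[i];
            printf("  sp+%02x: %p", i * (int)sizeof(ULONG_PTR), (void *)value);
            /* crude readability check, good enough for a sketch */
            if (value && !IsBadReadPtr((const void *)value, 8)) {
                char preview[9] = { 0 };
                memcpy(preview, (const void *)value, 8);
                for (int j = 0; j < 8; j++)
                    if (preview[j] < 0x20 || preview[j] > 0x7e) preview[j] = '.';
                printf("  \"%s\"", preview);
            }
            printf("\n");
        }
    }

    /* inside a hook:  PeekStack("VirtualProtect", (ULONG_PTR *)_AddressOfReturnAddress(), 16); */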

A natural progression will take us towards more obscure areas: hooking malloc, free, calloc, new, various constructors, and COM-oriented stuff, e.g. the CoTaskMemAlloc and CoTaskMemFree functions, COM interfaces, etc.

The scope is very big.

Is it worth it?

Yes, this trick worked for me on a number of occasions, and primarily… it saved me a lot of time; instead of trying to reverse engineer the whole thing, I would just wait for these functions to be called, dump the code, edit it a bit, beautify it, and analyze it.

And if you ever used Flypaper from HBGary, you are going to love this. Sandboxes offering such a granular level of API interception, or even inline-function monitoring, take what Flypaper did to the next level. You will see as many buffers as possible. You can inspect them on a timeline. You can literally copy & paste stuff out of them: actual code, configs, decrypted URLs and other IOCs, and you can break apart the C2 more easily as well.

And last, but not least. The difficult part.

When we talk about monitoring memory functions, there is one caveat I need to mention. Most of these functions, when intercepted, will require you to determine the size of the buffer that is being released. You need the proper size, or you will be dumping never-ending pages of random memory data. Trust me, without a proper size you will be dumping hundreds of megabytes of garbage.

For strings, you can calculate the length, or use predefined structures that hold the length of the string buffer. For a generic memory buffer it’s much harder. You may of course use various heuristics, exclude padding zeroes, etc., but… the best option is to obtain the actual, real size of the buffer.

I can think of three approaches here…

You can track memory allocation functions, register the requested sizes, and track their changes (e.g. via the realloc functions). Pretty hard to do.
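
A toy sketch of that first approach – a simple pointer-to-size table filled in by the allocation hooks and consulted by the free hooks (a real monitor would use a proper hash map and handle reallocations and multiple threads):

    #include <windows.h>

    #define MAX_TRACKED 4096

    /* Toy pointer-to-size table, filled from the allocation hooks
       (HeapAlloc/VirtualAlloc/malloc/...) and consulted from the free hooks. */
    static struct { void *ptr; SIZE_T size; } g_alloc[MAX_TRACKED];

    /* Called from the allocation hooks. */
    void RememberAlloc(void *ptr, SIZE_T size)
    {
        for (int i = 0; i < MAX_TRACKED; i++) {
            if (g_alloc[i].ptr == NULL) {
                g_alloc[i].ptr  = ptr;
                g_alloc[i].size = size;
                return;
            }
        }
    }

    /* Called from the free hooks: returns the recorded size (and forgets the
       entry), or 0 if the pointer was never seen. */
    SIZE_T LookupAndForget(void *ptr)
    {
        for (int i = 0; i < MAX_TRACKED; i++) {
            if (g_alloc[i].ptr == ptr) {
                SIZE_T size = g_alloc[i].size;
                g_alloc[i].ptr = NULL;
                return size;
            }
        }
        return 0;
    }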

Or…

You can interpret the actual memory of the process to calculate the size.

Or…

It’s really handy if your sandbox monitor can actually call dedicated API functions that provide this information ad hoc, within the context of the given process and thread. So, when your callback for a ‘free’ function is called, you call the ‘give_me_the_size_of_this_block_given_the_address’ function. With the retrieved size, you can dump a properly sized buffer.

For instance:

  • for heap functions you can call RtlSizeHeap
  • for Virtual memory functions you can call VirtualQuery (see the sketch below)
  • for COM functions, you can call respective APIs or methods, if they exist
  • for any high-level-language wrappers you need to find the (often inlined) helper function that will tell you the size of the allocated buffer based on its address

It’s very invasive and somewhat expensive, but it works really well.
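
For example, a VirtualFree interceptor can ask VirtualQuery for the size of the region that is about to disappear – a minimal sketch, again with the hypothetical DumpBuffer() helper:

    #include <windows.h>

    static BOOL (WINAPI *Real_VirtualFree)(LPVOID, SIZE_T, DWORD) = VirtualFree;

    void DumpBuffer(const char *tag, const void *addr, SIZE_T size); /* hypothetical helper */

    /* Replacement VirtualFree: with MEM_RELEASE the size argument is 0, so ask
       the memory manager how large the committed region really is. */
    BOOL WINAPI Hook_VirtualFree(LPVOID addr, SIZE_T size, DWORD type)
    {
        MEMORY_BASIC_INFORMATION mbi;
        if (addr && VirtualQuery(addr, &mbi, sizeof(mbi)) == sizeof(mbi) &&
            mbi.State == MEM_COMMIT) {
            /* RegionSize covers the run of pages with identical attributes
               starting at the queried address - good enough to dump the payload */
            DumpBuffer("vfree", addr, mbi.RegionSize);
        }
        return Real_VirtualFree(addr, size, type);
    }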

To conclude… buffers are everywhere and it’s worth looking at them, collecting them, and offering them to analysts:

  • Any sort of memory functions that are documented Windows API
  • Any sort of functions that are detectable inline (statically linked libs, Delphi, etc.)
  • Mapping files and sections, Unmapping files and sections.
  • Crypto functions (CryptDecrypt, CryptEncrypt, CryptDeriveKey, CryptHashData, CryptProtectData) – see the sketch after this list
  • File Writing, File Reading functions. File Seeking functions.
  • Internet Read, Write functions.
  • Copying memory buffers
  • Filling in memory buffers with zeroes or other values
  • Compression/Decompression – built-in, and well-known copypasta code (or family-based) that can be hooked inline
  • Encoding/Decoding, as above
  • Database queries
  • WMI queries
  • String operations of any sort, including translation (Unicode->MBCS, DBCS, ANSI, and vice versa)
  • Hash calculation – in and out buffers (e.g. A_SHAInit|Update|Final)
  • Resource buffers
  • GUI elements (not only desktop screenshots, but also window elements, including invisible ones, icons, bitmaps, menus, dialog boxes, property sheets, etc.)
  • Bitmaps of any sort (BitBlt, StretchBlt, etc.)
  • Video buffers of any sort (capCreateCaptureWindow)
  • DirectX/OpenGL buffers
  • Console buffers
  • MessageBox buffers
  • Programming language-specific buffer APIs (e.g. VB __vbaCopyBytes)
  • and tons of others
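
To pick one example from the list above – the crypto functions – the interesting buffer often only exists after the real API returns. A hypothetical CryptDecrypt interceptor would therefore dump pbData post-call, once it holds the plaintext:

    #include <windows.h>
    #include <wincrypt.h>

    static BOOL (WINAPI *Real_CryptDecrypt)(HCRYPTKEY, HCRYPTHASH, BOOL, DWORD, BYTE *, DWORD *) = CryptDecrypt;

    void DumpBuffer(const char *tag, const void *addr, SIZE_T size); /* hypothetical helper */

    /* Replacement CryptDecrypt: pbData holds ciphertext going in and plaintext
       coming out, so the dump happens after the real call returns. */
    BOOL WINAPI Hook_CryptDecrypt(HCRYPTKEY key, HCRYPTHASH hash, BOOL final,
                                  DWORD flags, BYTE *data, DWORD *dataLen)
    {
        BOOL ok = Real_CryptDecrypt(key, hash, final, flags, data, dataLen);
        if (ok && data && dataLen)
            DumpBuffer("cryptdecrypt", data, *dataLen);
        return ok;
    }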

It all asks for interception.

It all asks for depth of analysis that goes beyond your regular sandbox output.

It all asks to be configurable.

Modern sandboxes intercept a lot of artifacts created by the samples. I value most those that can actually preserve not only information about high-level artifacts, but also full snapshots of file content, Registry buffers, network operations, and memory dumps (including properly dumped PE files, where available), as well as windows. The more the merrier.

A sandbox as a tool to determine whether a sample is bad or good is old news.

A sandbox that actively supports the reverser who will take all these dumped buffers and finish the analysis of the sample is much better news.

And what’s in it for sandbox companies?

Better detection capabilities. Expansion of the audience from just analysts to hardcore reversers. Expansion of the possible market to QA/QC/test labs. Providing black-box support for debugging and localization.

Perhaps simply ‘being ahead of the curve’, too?
