I have had this idea for a while and today I finally implemented it. My implementation is pretty lame and only affects monitoring solutions that rely on hooking or intercepting user-mode APIs, but after all, it’s just a proof of concept (plus, it should be possible to implement something similar in kernel mode too).
So the idea goes as follows:
In a typical scenario, when the execution flow hits the address of an API (the one the sample is calling), the API hook takes over, or the monitor detects that the API was called by recognizing that the instruction pointer has reached the known address of the API. Once this happens, the monitor captures the function arguments and these end up in the logs. Some monitors also intercept the moment when the API returns; this is handy as it lets them capture buffers modified by the API.
Of course, the need to either hook APIs or recognize when they are called is subject to many evasions, e.g.:
- one can detect hooks
  - checking if functions start with JMPs or short privileged instructions
  - cross-referencing their code with the code read directly from the DLL’s image (a file)
  - one can make a legitimate-looking call and trace it (e.g. using TF=1) to see if the code goes through any region not mapped to a legitimate OS DLL (this usually means either malware, or some monitor hooking APIs)
- one can bypass hooks/tracers/monitors
  - using stolen bytes (a length disassembler is used to copy code to a different buffer and the API is called via such a trampoline)
  - one can call an address a bit earlier than the API (typically APIs are preceded by a series of NOPs); this is not a strong evasion per se, but it may fool a tracer/emulator that tries to match the call instruction operand against a list of known API addresses
  - one can call an address a bit past the API start (‘mov edi, edi’ is usually there and can be skipped; some common instructions, e.g. ‘push’, can be emulated)
  - one can use a different API (lots of them do the very same thing)
  - one can use functions inline
- etc.
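The first two hook-detection checks from the list above can be sketched in a few lines. This is only an illustration working on raw byte strings: the function names `prologue_looks_hooked` and `differs_from_disk` are made up for this example, the opcode list is far from complete, and a real check would also have to account for relocations and for legitimate functions that happen to start with a jump.

```python
# Opcodes that often appear at the start of a hooked function.
# Illustrative and not exhaustive; legitimate stubs can start with a JMP too.
JMP_REL32 = 0xE9   # jmp rel32
JMP_REL8 = 0xEB    # jmp rel8
INT3 = 0xCC        # breakpoint
HLT = 0xF4         # one-byte privileged instruction, faults in user mode
PUSH_IMM32 = 0x68  # push imm32 (used by push/ret style hooks)
RET = 0xC3         # ret


def prologue_looks_hooked(prologue: bytes) -> bool:
    """Heuristic: does the function start with a typical hook pattern?"""
    if not prologue:
        return False
    first = prologue[0]
    if first in (JMP_REL32, JMP_REL8, INT3, HLT):
        return True
    # jmp dword ptr [addr] is encoded as FF 25 <addr>
    if prologue[:2] == b"\xff\x25":
        return True
    # push imm32 followed directly by ret: 68 <4 bytes> C3
    if len(prologue) >= 6 and first == PUSH_IMM32 and prologue[5] == RET:
        return True
    return False


def differs_from_disk(in_memory: bytes, on_disk: bytes, length: int = 16) -> bool:
    """Compare the first bytes of the loaded function with the DLL file copy.

    Assumes the caller already located the same function in both buffers
    and that the compared bytes contain no relocated addresses.
    """
    return in_memory[:length] != on_disk[:length]


# 8B FF 55 8B EC = mov edi, edi / push ebp / mov ebp, esp
clean = b"\x8b\xff\x55\x8b\xec"
hooked = b"\xe9\x10\x20\x30\x40"  # jmp to some detour (made-up offset)
```

Calling `prologue_looks_hooked(clean)` is false while `prologue_looks_hooked(hooked)` is true, and `differs_from_disk(hooked, clean)` flags the mismatch between the two copies.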
It crossed my mind that one could instrument the code execution in such a way that the monitor is still allowed to intercept the API calls and their arguments, while the arguments are quietly swapped ‘on the fly’ after they have passed the monitoring stage.
So it would look like this:
- program calls an API
- monitor/interceptor takes control/recognizes API & logs its arguments
- right before the control is passed to OS to actually execute the API and do the thing it is supposed to do (e.g. create file), the argument (f.ex. file name) would get swapped to something else
The result would be that the report contains a reference to incorrect argument(s); the monitor suffers from a falsified reality, because it failed to account for a race condition between the moment it logs the arguments and the moment the OS uses them.
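The three steps above boil down to a time-of-check/time-of-use gap, which can be modeled without any Windows code at all. The sketch below is a toy simulation, not the PoC: `run_monitored`, `monitor_log` and `created_files` are hypothetical names, the "monitor" is just a list append, and the "OS" is another list append that reads the argument only when it is actually used.

```python
def run_monitored(args: dict, monitor_log: list, created_files: list, swap=None) -> None:
    """Toy model of one monitored API call.

    1. The monitor records the argument as it looks at hook time.
    2. An optional callback gets a chance to modify the argument.
    3. The simulated OS reads the argument at use time.
    """
    monitor_log.append(args["path"])    # step 1: what ends up in the report
    if swap is not None:
        swap(args)                      # step 2: the race window
    created_files.append(args["path"])  # step 3: what really happens


def swap_to_evil(args: dict) -> None:
    # Replace the file name after it has been logged.
    args["path"] = r"\??\c:\evil"


monitor_log: list = []
created_files: list = []
run_monitored({"path": r"\??\c:\good"}, monitor_log, created_files, swap=swap_to_evil)
```

After the call, `monitor_log` holds only the `\??\c:\good` name while `created_files` holds only the `\??\c:\evil` one, which is exactly the mismatch between report and reality described above.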
One needs to choose carefully the moment at which the code swaps the arguments of the potentially monitored function. It’s not an easy task since there are a lot of things going on, and finding an internal point in the food chain where this data could be swapped may be tricky. Many internal functions copy data from one buffer to another, and once the data is copied it can no longer be controlled or modified in an easy way.
Assuming we can control the data we can come up with some ideas on how to mod it.
The simplest way could be to choose a ‘lower’ API in the food chain, so e.g. the program could call the Sleep API while at the same time hooking NtDelayExecution to replace the data passed to Sleep with something else. Monitors logging only Sleep would get fooled. Obviously, most sandboxes nowadays monitor NtDelayExecution so it wouldn’t work in real life, but it’s just one of the ideas.
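The layering trick can be illustrated with a small simulation. Everything here is a stand-in: `FakeNtdll` only records the interval it is asked to wait for, `Sleep` and `monitored_Sleep` are plain Python functions, and the conversion of milliseconds into a negative count of 100-nanosecond units mirrors how relative delays are commonly passed to NtDelayExecution (treat that detail as an assumption of this sketch).

```python
class FakeNtdll:
    """Stand-in for the lower layer; it only records requested delays."""

    def __init__(self) -> None:
        self.delays = []

    def NtDelayExecution(self, interval_100ns: int) -> None:
        self.delays.append(interval_100ns)


ntdll = FakeNtdll()
monitor_log = []


def Sleep(ms: int) -> None:
    # Higher-level API: converts to a relative interval and calls the lower layer.
    ntdll.NtDelayExecution(-ms * 10_000)


def monitored_Sleep(ms: int) -> None:
    # The monitor hooks only the top-level API and logs its argument.
    monitor_log.append(("Sleep", ms))
    Sleep(ms)


# The sample hooks the lower layer and patches the value on its way down.
original = ntdll.NtDelayExecution


def patched_delay(interval_100ns: int) -> None:
    original(-1 * 10_000)  # always wait 1 ms, whatever Sleep was asked for


ntdll.NtDelayExecution = patched_delay

monitored_Sleep(60_000)  # the log will claim a 60-second sleep
```

Here `monitor_log` records a 60000 ms sleep, while the lower layer only ever sees a 1 ms delay. A monitor that also watches the lower layer would see the real value, which is the limitation mentioned above.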
Another way to detect access to the data could rely on page access rights (e.g. page guard, or no access) or, alternatively, a null selector. This would guarantee that access to the data (either the specific data we want to monitor, or _any_ data if a null selector is used) generates an exception. An exception handler could then swap the data somewhere ‘inside’ the monitored API, still _after_ its arguments have already been logged.
In my PoC I used a simple method of monitoring execution of the API via tracing (TF=1); once the tracing exception handler finds the code transitioning execution to kernel mode (e.g. a sysenter instruction), it checks if the argument on the stack comes from the API call we monitor; if that is the case, it swaps the data.
So, the flow goes like this:
- Set up an exception handler
- Enable tracing (TF=1)
- Execute NtCreateFile API with the file name \??\c:\good
- Monitor execution via tracing; the moment sysenter is detected, check if it is indeed our monitored NtCreateFile – if it is, swap the file name to \??\c:\evil
- OS should create \??\c:\evil file while monitors should report \??\c:\good was created
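The flow above can be mimicked with a toy single-stepper. This is a simulation only and not the actual PoC: `run_traced`, `make_handler` and the `context` dictionary are invented for the example, the three-instruction stub (`mov edx, esp` / `sysenter` / `ret`) is just an illustrative byte sequence, and a real implementation would instead set TF in EFLAGS and inspect the thread context inside an exception handler.

```python
SYSENTER = b"\x0f\x34"  # opcode bytes of the sysenter instruction

# Illustrative syscall stub: mov edx, esp / sysenter / ret
STUB = [b"\x8b\xd4", SYSENTER, b"\xc3"]


def run_traced(instructions, context, on_single_step, kernel) -> None:
    """Call the trace handler before every instruction, like a TF=1 single step."""
    for opcode in instructions:
        on_single_step(opcode, context)
        if opcode == SYSENTER:
            kernel(context)  # the transition to 'kernel mode'


def make_handler(good: str, evil: str):
    def handler(opcode: bytes, context: dict) -> None:
        # Swap only at the kernel transition, and only for our own argument.
        if opcode == SYSENTER and context["path"] == good:
            context["path"] = evil
    return handler


monitor_log = []
created_files = []

context = {"path": r"\??\c:\good"}
monitor_log.append(context["path"])  # the monitor logs at API entry

run_traced(
    STUB,
    context,
    make_handler(r"\??\c:\good", r"\??\c:\evil"),
    kernel=lambda ctx: created_files.append(ctx["path"]),
)
```

In this model the monitor's log contains `\??\c:\good` and the simulated kernel receives `\??\c:\evil`. A monitor that captures arguments at the kernel transition itself, or after it, would not be affected by this particular ordering.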
When the test program below is run on a normal system (or inside a VM) it creates the file \??\c:\evil.
What happens if it is monitored? I submitted it to a couple of sandboxes and ran it under some of the available API monitors, and I got mixed results. Some of the API monitors got fooled. Some sandboxes I tested it with didn’t really show any results. Some monitors made the program create the \??\c:\good file (which is unexpected behavior; could be a bug in my PoC 🙂 ), and some sandboxes reported \??\c:\good as well, but I have no way of checking whether that file was actually created, or whether my PoC really fooled the monitor…
If you want to play around, here is the file.
Note: when you run it you need to kill it manually. This is because, seeing that the sandboxes didn’t report anything, I added a dummy loop that never ends. It creates the 〰〰〰 file and then sleeps for 70 milliseconds so as not to steal all the CPU cycles.