Running programs and expecting them to behave is nothing but wishful thinking. When you start processing thousands of files, you quickly discover that the reality of automated dynamic software analysis is quite harsh. Component and software dependencies, missing command line arguments, crashes, annoying nag screens, installers using non-standard GUI toolkits, software written for older OSes or frameworks, expired evaluation versions of software protection schemes, trials, evaluation copies of shareware, pranks, corrupted files, uninstallers, and many more make the samples misbehave.
Once executed, many samples simply exit – not necessarily in a very graceful way. The analysis fails.
There is no easy way to force these applications to actually run – typically, manual analysis is required to create a new behavioral rule (often with a patch) that will force this app, and similar ones in the future, to execute further, beyond the exit condition. Sometimes it’s not even possible. Notably, patches can be applied not only to the samples, but also to the analysis system (f.ex. installing missing dependencies like a specific version of .NET, old-school OCX, old Borland files, etc.). It may also be necessary to bypass software protection schemes, i.e. crack samples – the legality of which is somewhat shady.
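To give a rough idea of what I mean by a behavioral rule (a purely hypothetical sketch – the rule shape, field names, and actions below are my own illustration, not any particular sandbox’s format), it could be as simple as a record mapping an observed failure condition to a set of remediation actions:

```python
from dataclasses import dataclass, field

@dataclass
class BehavioralRule:
    """Hypothetical rule: when a known failure condition is seen, apply remediation actions."""
    name: str
    caption_pattern: str          # regex matched against message box captions / texts
    actions: list = field(default_factory=list)

# Illustrative only: a rule reacting to the classic 'VB6 runtime missing' complaint
RULES = [
    BehavioralRule(
        name="vb6-runtime-missing",
        caption_pattern=r"msvbvm60\.dll",
        actions=["install:vb6-runtime", "rerun-sample"],
    ),
]
```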
To apply these patches and workarounds, one needs to analyze the existing conditions that cause these samples to fail. Surprisingly, lots of them can be found by reading the message box captions and texts. As usual, this is not a trivial task since we deal with many languages, many different cases, and uncertainties, but it is possible – patterns can be observed.
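A minimal sketch of how a sandbox agent might scrape those captions and texts on Windows – assuming plain Win32 dialogs created by MessageBox (window class "#32770", body text held in Static controls); samples using custom GUI toolkits will obviously need other tricks:

```python
import ctypes
from ctypes import wintypes

user32 = ctypes.WinDLL("user32", use_last_error=True)
EnumWindowsProc = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)

def _window_text(hwnd):
    length = user32.GetWindowTextLengthW(hwnd)
    buf = ctypes.create_unicode_buffer(length + 1)
    user32.GetWindowTextW(hwnd, buf, length + 1)
    return buf.value

def _class_name(hwnd):
    buf = ctypes.create_unicode_buffer(256)
    user32.GetClassNameW(hwnd, buf, 256)
    return buf.value

def collect_message_boxes():
    """Collect (caption, body text) of visible standard dialog windows."""
    boxes = []

    def on_top_level(hwnd, lparam):
        if user32.IsWindowVisible(hwnd) and _class_name(hwnd) == "#32770":
            texts = []

            def on_child(child, lparam2):
                if _class_name(child) == "Static":   # message body lives in Static controls
                    t = _window_text(child)
                    if t:
                        texts.append(t)
                return True

            user32.EnumChildWindows(hwnd, EnumWindowsProc(on_child), 0)
            boxes.append((_window_text(hwnd), " ".join(texts)))
        return True

    user32.EnumWindows(EnumWindowsProc(on_top_level), 0)
    return boxes
```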
For starters, let’s look at expired software protection schemes. There are lots of malware samples where the author used an evaluation version of a protection scheme with the aim of hiding the actual payload. When the sample is executed, the protection scheme checks the conditions and, if the evaluation period has expired, simply prevents the app from running. One could argue that detecting an evaluation / trial / unregistered version is, on its own, a good enough condition to classify the sample as potentially unwanted, at least. Still, it does require signatures (either static or dynamic) to detect this sort of sample.
Here are some examples:
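To illustrate what such a dynamic signature might boil down to (the keyword patterns below are made up for illustration – real signature sets are per-protector and far larger):

```python
import re

# Illustrative patterns only - real signatures are per-protector and far more specific.
EXPIRED_PROTECTION_PATTERNS = [
    re.compile(r"trial (period )?(has )?expired", re.I),
    re.compile(r"evaluation (period|version|copy)", re.I),
    re.compile(r"unregistered (version|copy)", re.I),
]

def looks_like_expired_protector(caption: str, text: str) -> bool:
    """True if a message box suggests an expired/evaluation protection scheme."""
    blob = f"{caption} {text}"
    return any(p.search(blob) for p in EXPIRED_PROTECTION_PATTERNS)
```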
The missing component scenario is also very common. While Visual Basic and Borland C are no longer that popular, there are lots of old samples out there belonging to software written on these old programming platforms. A sandbox should be expecting these in its queue…
Again, a few examples:
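One way to anticipate this class of failures before detonation is to statically check the sample’s import table against the runtimes actually present on the analysis VM – a sketch below, using the third-party pefile module; the runtime list is illustrative, not exhaustive:

```python
import os
import pefile  # third-party: pip install pefile

# Runtimes that old VB / Borland-built samples commonly need; list is illustrative only.
KNOWN_LEGACY_RUNTIMES = {
    b"msvbvm50.dll", b"msvbvm60.dll",   # Visual Basic 5/6
    b"vcl50.bpl", b"vcl60.bpl",         # Borland VCL packages
    b"cc3250mt.dll", b"borlndmm.dll",   # Borland C++ runtime
}

def missing_legacy_runtimes(sample_path,
                            system_dirs=(r"C:\Windows\System32", r"C:\Windows\SysWOW64")):
    """Return imported legacy runtime DLLs that are not present on the analysis VM."""
    pe = pefile.PE(sample_path, fast_load=True)
    pe.parse_data_directories(
        directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_IMPORT"]])
    missing = []
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        name = entry.dll.lower()
        if name in KNOWN_LEGACY_RUNTIMES:
            if not any(os.path.exists(os.path.join(d, name.decode())) for d in system_dirs):
                missing.append(name.decode())
    return missing
```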
Large repositories of samples can’t escape programs written for localized versions of Windows. Running such applications on an English OS leads to ‘garbage’ message boxes showing up with lots of gibberish that has no particular meaning, and it’s hard to deduce what they mean until they are analyzed.
Here is an example of such a message box:
It turns out that it’s not a crash – just a message telling the user:
- 제거를 위해 모든 익스플로러창을 닫게됩니다.
which – after Google translation – says:
- To remove any Explorer window it will be closed.
I don’t speak Korean, but guessing by the GT output I assume it is just a notification that the program will kill all (Internet) Explorer windows before it can remove some app. Whatever the meaning – one has to ensure it is analyzed so that the sample (and potentially similar ones) actually works properly.
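Before any of that, the sandbox has to notice that a message is localized at all. A crude heuristic one could imagine (a sketch only – the thresholds and script handling are my assumptions) is to look at which Unicode script dominates the captured text and flag anything non-Latin for translation or manual review:

```python
import unicodedata

def dominant_script(text: str) -> str:
    """Very rough heuristic: the Unicode script most alphabetic characters belong to."""
    counts = {}
    for ch in text:
        if ch.isalpha():
            # Unicode character names start with the script, e.g. "HANGUL ...", "CJK ...", "LATIN ..."
            script = unicodedata.name(ch, "UNKNOWN").split(" ")[0]
            counts[script] = counts.get(script, 0) + 1
    return max(counts, key=counts.get) if counts else "UNKNOWN"

def needs_translation(caption: str, text: str) -> bool:
    """Flag message boxes whose dominant script is not Latin for translation / manual review."""
    return dominant_script(caption + " " + text) not in ("LATIN", "UNKNOWN")

# e.g. dominant_script("제거를 위해 모든 익스플로러창을 닫게됩니다.") -> "HANGUL"
```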
Many samples crash – this is overwhelming and I think the only way to handle it is to signal in the report that the app has crashed. Again, not a trivial problem to solve. You may detect Dr. Watson launching, .NET crashes, or other default crash windows popping up, but you can’t expect them all the time – many frameworks handle crashes gracefully and, as such, the sandbox needs to recognize these properly. There are also samples I came across that don’t even indicate the crash – one needs to recognize it from the flow of the code execution, i.e. the program’s business logic following the ‘something is wrong’ path (f.ex. some installers do it).
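A sketch of what combining those weaker signals might look like – the helper process names and NTSTATUS exit codes are standard Windows ones, but the captions and the overall heuristic are just my assumptions:

```python
# Weak signals that, combined, suggest the sample crashed rather than ran.
CRASH_HELPER_PROCESSES = {"werfault.exe", "drwtsn32.exe", "dwwin.exe", "dw20.exe"}
CRASH_EXIT_CODES = {
    0xC0000005,  # STATUS_ACCESS_VIOLATION
    0xC00000FD,  # STATUS_STACK_OVERFLOW
    0xC0000409,  # STATUS_STACK_BUFFER_OVERRUN
    0xE0434352,  # unhandled CLR (.NET) exception
}

def looks_like_crash(exit_code, spawned_processes, dialog_captions):
    """Heuristic crash verdict built from several weak signals (sketch only)."""
    if exit_code is not None and (exit_code & 0xFFFFFFFF) in CRASH_EXIT_CODES:
        return True
    if any(p.lower() in CRASH_HELPER_PROCESSES for p in spawned_processes):
        return True
    # Default WER / Dr. Watson captions; localized systems and custom handlers won't match.
    if any("has stopped working" in c or "Application Error" in c for c in dialog_captions):
        return True
    return False
```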
If you think crash detection would only require a quick regexp on a couple of commonly used ‘crashing’ words (‘error’, ‘crash’, ‘corrupt’, etc.) – think again – here are some examples of such messages, and in reality there are hundreds, if not more, variants:
Last but not least – some malware intentionally shows fake message boxes. They may contain misleading information and may confuse naive engines looking for specific keywords or even phrases – relying on the messages alone is not enough to make the final call.
Yup, sandboxing can be perceived as pretty hopeless 🙂