Enter Sandbox – part 9: Message is in a bottle, and sometimes in a box

September 24, 2015 in Batch Analysis, Reversing, Sandboxing

Running programs and expecting them to behave is nothing but wishful thinking. When you start processing thousands of files you quickly discover that the reality of automated dynamic software analysis is quite harsh. Component and software dependencies, missing command line arguments, crashes, annoying nag screens, installers using non-standard GUI toolkits, software written for older OSes or frameworks, expired evaluation versions of software protection schemes, trials, evaluation copies of shareware, pranks, corrupted files, uninstallers, and many more make the samples misbehave.

Once executed, many samples simply exit – not necessarily in a very graceful way. The analysis fails.

There is no easy way to force these applications to actually run – typically, manual analysis is required to create a new behavioral rule (often with a patch) that will force this app, and similar apps in the future, to execute further, beyond the exit condition. Sometimes it’s not even possible. Notably, patches can be applied not only to the samples, but also to the analysis system (f.ex. by installing missing dependencies like a specific version of .NET, old-school OCX, old Borland files, etc.). It may also be necessary to bypass software protection schemes, i.e. crack samples – the legality of which is somewhat shady.

To apply these patches and workarounds, one needs to analyze the existing conditions that cause these samples to fail. Surprisingly, lots of them can be found by reading the message box captions and texts. As usual, this is not a trivial task since we deal with many languages, many different cases, and uncertainties, but it is possible, and patterns can be observed.
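As a rough illustration of reading message box captions and texts, here is a minimal Python sketch of a rule table that maps a captured message box to a failure category. Everything in it is an assumption made for the example: the category names, the `classify_message_box` helper, and especially the phrases, which are placeholders and not strings taken from real samples. A real rule set would be far larger and multilingual.

```python
import re
from typing import Optional

# Hypothetical rules: (category, pattern). The phrases are illustrative
# placeholders only; they are not taken from real protection schemes.
RULES = [
    ("expired_protection",
     re.compile(r"\b(trial|evaluation)\b.*\b(expired|ended)\b", re.I)),
    ("unregistered",
     re.compile(r"\bunregistered\b", re.I)),
    ("missing_component",
     re.compile(r"\b[\w.-]+\.(dll|ocx|bpl)\b.*\b(not found|missing)\b", re.I)),
    ("missing_argument",
     re.compile(r"\busage:", re.I)),
]


def classify_message_box(caption: str, text: str) -> Optional[str]:
    """Return the first matching category, or None if no rule fires."""
    # Join caption and text with a space so one pattern can span both.
    combined = f"{caption} {text}"
    for category, pattern in RULES:
        if pattern.search(combined):
            return category
    return None


if __name__ == "__main__":
    print(classify_message_box("Error", "The evaluation period has expired."))
    print(classify_message_box("App", "MSVBVM50.DLL was not found"))
    print(classify_message_box("Hello", "World"))
```

A `None` result is just as interesting as a hit: it marks a message box nobody has written a rule for yet, which is exactly the kind of sample that needs a manual look.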

For starters, let’s look at expired software protection schemes. There are lots of malware samples where the author used an evaluation version of a protection scheme with the aim of hiding the actual payload. When the sample is executed, the protection scheme checks its conditions and, if the evaluation period has expired, simply prevents the app from running. One could argue that detecting an evaluation / trial / unregistered version is, on its own, a good enough condition to classify the sample as at least potentially unwanted. Still, it does require signatures (either static or dynamic) to detect this sort of sample.

Here are some examples:












The missing component scenario is also very common. While Visual Basic and Borland C are no longer that popular, there are lots of old samples out there that were written on these old programming platforms. A sandbox should expect these in its queue…

Again, a few examples:








Large repositories of samples can’t escape programs written for localized versions of Windows. Running such applications on an English OS leads to ‘garbage’ message boxes full of gibberish, and it’s hard to deduce what they mean until they are analyzed.

Here is an example of such message box:


It turns out that it’s not a crash – just a message telling the user:

  • 제거를 위해 모든 익스플로러창을 닫게됩니다.

which – after Google translation – says:

  • To remove any Explorer window it will be closed.

I don’t speak Korean, but guessing by the GT output I assume it is just a notification that the program will close all (Internet) Explorer windows before it can remove some app. Whatever the meaning, one has to ensure it is analyzed so that the sample (and potentially similar ones) actually runs properly.

Many samples crash – the volume is overwhelming, and I think the only way to handle this is to signal in the report that the app has crashed. Again, not a trivial problem to solve. You may detect a Dr Watson launch, .NET crashes, or other default crash windows popping up, but you can’t expect them all the time – many frameworks handle crashes gracefully and a sandbox needs to recognize these as well. There are also samples I came across that don’t even indicate the crash – one needs to recognize it from the flow of the code execution, i.e. the program’s business logic following the ‘something is wrong’ path (f.ex. some installers do it).
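The easy part of that, the default crash paths, can be sketched in a few lines of Python. The `RunRecord` structure and the `crash_indicators` helper are hypothetical stand-ins for whatever a given sandbox actually logs. The process names and NTSTATUS / CLR exception codes are the well-known Windows ones, but treat both lists as starting points rather than a complete set. Crashes that a framework swallows gracefully will not show up here at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Processes Windows typically starts when it reports an application fault
# (Dr Watson on older systems, Windows Error Reporting on newer ones).
CRASH_HANDLER_PROCESSES = {"drwtsn32.exe", "dwwin.exe", "werfault.exe"}

# Exit codes commonly left behind by an unhandled exception.
CRASH_EXIT_CODES = {
    0xC0000005: "access violation",
    0xC000001D: "illegal instruction",
    0xC0000094: "integer division by zero",
    0xC00000FD: "stack overflow",
    0xC0000409: "stack buffer overrun",
    0xE0434352: "unhandled .NET (CLR) exception",
}


@dataclass
class RunRecord:
    """Hypothetical summary of one sandbox run."""
    exit_code: Optional[int] = None
    spawned_processes: List[str] = field(default_factory=list)


def crash_indicators(run: RunRecord) -> List[str]:
    """Collect human-readable crash hints for the report."""
    hints = []
    if run.exit_code is not None:
        # Some tools log exit codes as signed 32-bit values; normalize them.
        code = run.exit_code & 0xFFFFFFFF
        if code in CRASH_EXIT_CODES:
            hints.append(f"exit code 0x{code:08X} ({CRASH_EXIT_CODES[code]})")
    for path in run.spawned_processes:
        # Keep only the file name, whatever path separator was logged.
        name = path.replace("/", "\\").rsplit("\\", 1)[-1].lower()
        if name in CRASH_HANDLER_PROCESSES:
            hints.append(f"crash handler started: {name}")
    return hints


if __name__ == "__main__":
    run = RunRecord(exit_code=-1073741819,
                    spawned_processes=[r"C:\Windows\System32\WerFault.exe"])
    for hint in crash_indicators(run):
        print(hint)
```

An empty list only means that none of the default signals fired. It does not mean the sample ran correctly.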

If you think crash detection only requires a quick regexp on a couple of commonly used ‘crashing’ words (‘error’, ‘crash’, ‘corrupt’, etc.) – think again. Here are some examples of such messages, and in reality there are hundreds of variants, if not more:









Last, but not least – some malware intentionally shows fake message boxes. They may contain misleading information and may confuse naive engines looking for specific keywords or even phrases – relying on the messages alone is not enough to make the final call.

Yup, sandboxing can be perceived as pretty hopeless :)

Enter Sandbox – part 7: Hello, مرحبا, 您好, здравствуйте, γεια σας

June 27, 2015 in Batch Analysis, Malware Analysis, Sandboxing

Most modern applications use Windows APIs that rely on Unicode (or, at least, a subset of it) and as such they call the ‘W’ versions of the APIs, as opposed to older apps that used the ANSI ‘A’ versions (f.ex. CreateFileW vs. CreateFileA). Of course, the native APIs have relied on Unicode for a long time. Unicode makes things easy and avoids the ambiguities associated with ANSI encodings, where the same bytes can map to many character sets depending on the OS/application version. This is why running old localized applications on an English OS leads to unrecognizable garbage characters shown in the UI.

The number of old apps that rely on ANSI functions is still huge, and not taking them into account makes it harder to cherry-pick interesting clues from the samples. Some of these clues can make it to the final report as well and actually enrich it a lot.

Let’s look at an example.

An application does something, and then displays a message box with a caption ‘Îøèáêà’ saying ‘Çàïðàøèâàåìûé ôàéë íå íàéäåí’.

Obviously, it doesn’t tell us much.

What if we attempted to translate it blindly into Unicode using the most popular ANSI encodings?

We would get something like this:

1250 (Central Europe)           = Îřčáęŕ
1251 (Cyrillic)                 = Ошибка
1252 (Latin I)                  = Îøèáêà
1253 (Greek)                    = Ξψθακΰ
1254 (Turkish)                  = Îøèáêà
1255 (Hebrew)                   = ־רטבךא
1256 (Arabic)                   = خّèلêà
1257 (Baltic)                   = Īųčįźą
1258 (Vietnam)                  = Îøèáêà
 874 (Thai)                     = ฮ๘่แ๊เ
 932 (Japanese Shift-JIS)       = ホ碎
 936 (Simplified Chinese GBK)   = 硒栳赅
 949 (Korean)                   = 丘矮魏
 950 (Traditional Chinese Big5) = 昮魨罻

for the caption, and for the message:

1250 (Central Europe)           = Çŕďđŕřčâŕĺěűé ôŕéë íĺ íŕéäĺí
1251 (Cyrillic)                 = Запрашиваемый файл не найден
1252 (Latin I)                  = Çàïðàøèâàåìûé ôàéë íå íàéäåí
1253 (Greek)                    = Ηΰοπΰψθβΰεμϋι τΰιλ νε νΰιδεν
1254 (Turkish)                  = Çàïğàøèâàåìûé ôàéë íå íàéäåí
1255 (Hebrew)                   = ַאןנארטגאולי פאיכ םו םאיהום
1256 (Arabic)                   = اàïًàّèâàهىûé ôàéë يه يàéنهي
1257 (Baltic)                   = Ēąļšąųčāąåģūé ōąéė ķå ķąéäåķ
1258 (Vietnam)                  = Çàïđàøèâàǻûé ôàéë íå íàéäåí
 874 (Thai)                     = วเ๏๐เ๘่โเๅ์๛้ ๔เ้๋ ํๅ ํเ้ไๅํ
 932 (Japanese Shift-JIS)       = ヌ瑜籵褌隆 鴉 淲 浯鱠褊
 936 (Simplified Chinese GBK)   = 青镳帏桠噱禧?羿殡 礤 磬殇屙
 949 (Korean)                   = 행穽星外齧荏?牒雨 張 壯藕孼
 950 (Traditional Chinese Big5) = 瀔僤魤馲檞?邍澣 翴 縺毈樇

Even without knowledge of the specific languages it’s easy to pick the correct mapping: code page 1251, which gives ‘Ошибка’ (Russian for ‘Error’) for the caption, and ‘Запрашиваемый файл не найден’ (‘The requested file was not found’) for the message.
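The blind translation shown in the two tables is easy to reproduce. The Python sketch below assumes the garbled string was captured as it is rendered on an English (code page 1252) system, turns it back into raw bytes, and decodes those bytes with each candidate code page. The `decode_candidates` and `guess_by_keywords` helpers and the tiny keyword list are made up for this example. A real language-recognition module would need much more than a handful of words.

```python
from typing import Dict, List

# Candidate ANSI code pages, named as Python's codecs know them.
CODEPAGES = [
    "cp1250", "cp1251", "cp1252", "cp1253", "cp1254", "cp1255", "cp1256",
    "cp1257", "cp1258", "cp874", "cp932", "cp936", "cp949", "cp950",
]

# Hypothetical, deliberately tiny list of words worth recognizing.
KNOWN_WORDS = {"ошибка", "файл", "fehler", "erreur", "błąd"}


def decode_candidates(garbled: str, displayed_as: str = "cp1252") -> Dict[str, str]:
    """Re-decode a mis-rendered string with every candidate code page."""
    # Recover the original bytes; this only works if every character
    # of the garbled string exists in the code page it was rendered with.
    raw = garbled.encode(displayed_as)
    # Undecodable byte sequences become U+FFFD instead of raising.
    return {cp: raw.decode(cp, errors="replace") for cp in CODEPAGES}


def guess_by_keywords(garbled: str) -> List[str]:
    """Return the code pages whose decoding contains a known word."""
    hits = []
    for cp, text in decode_candidates(garbled).items():
        if any(word in text.lower() for word in KNOWN_WORDS):
            hits.append(cp)
    return hits


if __name__ == "__main__":
    for cp, text in decode_candidates("Îøèáêà").items():
        print(f"{cp:8} = {text}")
    print(guess_by_keywords("Çàïðàøèâàåìûé ôàéë íå íàéäåí"))
```

The keyword lookup is the crudest possible way to automate the ‘pick the one that looks right’ step. It only helps with words someone has already put on the list.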

We can confirm it by running it on a Russian OS:


The exercise above, my friend, is an attempt to make a sandbox polyglot. Add some modules to recognize the most common languages and who knows, maybe it will be able to recognize that these calls to FindWindow know no linguistic boundaries and are… not too friendly:

  • Скрытый процесс запрашивает сетевой доступ
  • Hidden Process Requests Network Access
  • Ein versteckter Prozess verlangt Netzwerkzugriff.
  • Un proceso oculto solicita acceso a la red
  • Un processus cache requiert une connexion reseau.
  • Внимание: некоторые компоненты изменились
  • Warning: Components Have Changed
  • Warnung: Einige Komponenten wurden verandert.
  • Advertencia: Los componentes han cambiado
  • Avertissement : Les composants ont change
  • Menedżer Zadań Windows
  • Создать правило для
  • Create rule for
  • Regel fur
  • Crear regla para
  • Creer une regle pour
  • 瑞星杀毒软件
  • 登录信息
  • 文件保护
  • 월드 오브 워크래프트
  • 삼국지
  • 하이로우2
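
A first, very small step towards such a module could be tagging each window title with the writing system it uses. The Python sketch below does that from Unicode character names alone. The `dominant_script` and `script_profile` helpers are assumptions made for this example, and a script is of course not a language: ‘LATIN’ covers English, German, Polish and many more.

```python
import unicodedata
from collections import Counter
from typing import Iterable, List, Optional


def dominant_script(title: str) -> Optional[str]:
    """Return the most common script among the letters of a title."""
    counts = Counter()
    for ch in title:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        if name:
            # Character names start with the script, f.ex.
            # 'CYRILLIC SMALL LETTER A' or 'HANGUL SYLLABLE SAM'.
            counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def script_profile(titles: Iterable[str]) -> List[str]:
    """Sorted list of scripts seen across all window titles of a sample."""
    scripts = {dominant_script(t) for t in titles}
    scripts.discard(None)
    return sorted(scripts)


if __name__ == "__main__":
    titles = [
        "Создать правило для",
        "Create rule for",
        "Menedżer Zadań Windows",
        "瑞星杀毒软件",
        "삼국지",
        "하이로우2",
    ]
    for t in titles:
        print(f"{dominant_script(t):9} {t}")
    print(script_profile(titles))
```

A sample whose FindWindow arguments span several scripts is at least worth a note in the report, even before anyone translates the individual titles.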