
Enter Sandbox part 25: How to get into argument

June 11, 2019 in File Formats ZOO, Malware Analysis, Sandboxing

When you begin your programming career, one of the first lessons focuses on reading command line arguments. It seems trivial, but once you start coding more, and in new languages, you quickly discover that it’s actually far from trivial and a bit of a mess.

Programming languages use many different ways to access the command line arguments, e.g.:

  • argv
  • wargv
  • args
  • $argv
  • @ARGV
  • arg
  • sys.argv
  • ParamStr
  • Command$
  • WScript.Arguments
  • etc.

I can’t count how many times I googled the proper name/syntax for these over the years – ad hoc programming in different languages makes them quite difficult to remember. Also, some programming languages start indexing arguments from 0, some from 1.

The way to access these parameters also differs. Sometimes they are available as a string, sometimes as an array, sometimes you need to call a function to retrieve specific items for you, and in some cases you need to write your own parser or tokenizer.

And finally, some frameworks require a certain (standard) approach to passing arguments so that a (standard) parsing routine can extract them properly. Then there are quirks – paths with spaces, extra spaces, ANSI vs. Unicode characters – and you have two buffers available for parsing: a path to the actual executable, and its command line. And the first is not always a full path, or is a path expressed in a different way than expected.

It gets even more complicated when you start reversing. This time it’s not only programming languages per se, but also the binaries they produce and these differ depending on architecture, OS, compiler’s flavor, version, optimization settings. It is all very messy.

Grepping a repo of import function names I came up with this short list of APIs & external, or internal symbols/variables:

  • CommandArgs
  • CommandLineToArgvW
  • GetCommandLineA
  • GetCommandLineW
  • g_shell_parse_argv
  • osl_getCommandArg
  • osl_getCommandArgCount
  • rb_argv
  • StringToArgv
  • _acmdln
  • _wcmdln
  • __argc
  • __argv
  • __p__acmdln
  • __p__wcmdln
  • __p___argc
  • __p___argv
  • __p___wargv
  • __wargv

Why would we need these?

Many programs require command line arguments to run. Sandboxes that can’t recognize this will fail to produce an accurate report. Not only does some malware use this trick on purpose, there are also tons of good programs that end up in sandbox repositories and never get properly analyzed (e.g. compiled work from IT students, or native OS binaries).

Sandboxes that recognize programming frameworks & the way they parse command line arguments are in a better position to analyze such samples. This is because there is at least a theoretical possibility of determining heuristically whether a sample requires command line arguments, or whether it accepts any. At the very least, sandboxes should hint at that in their reports.
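As a minimal sketch of such a hint, the snippet below checks a sample’s import names against the short list above. It assumes the import names have already been extracted by some other tool; the function name `command_line_imports` is made up for this illustration, and matching is exact and case-sensitive.

```python
# Symbols from the list above that suggest command line handling.
COMMAND_LINE_SYMBOLS = frozenset({
    "CommandArgs", "CommandLineToArgvW", "GetCommandLineA", "GetCommandLineW",
    "g_shell_parse_argv", "osl_getCommandArg", "osl_getCommandArgCount",
    "rb_argv", "StringToArgv", "_acmdln", "_wcmdln", "__argc", "__argv",
    "__p__acmdln", "__p__wcmdln", "__p___argc", "__p___argv", "__p___wargv",
    "__wargv",
})


def command_line_imports(import_names):
    """Return the sorted, de-duplicated import names found on the list.

    `import_names` is assumed to be an iterable of strings produced by
    whatever import-table parser the sandbox already uses.
    """
    return sorted(set(import_names) & COMMAND_LINE_SYMBOLS)
```

A non-empty result proves nothing on its own (C runtimes often pull these symbols in anyway), but it is a cheap signal to surface in a report.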

There are some command line arguments that are universal and can be guessed, e.g. /? or /h. Others require a lot of reversing since the program’s logic is often hidden under many layers of code and nested calls.

What kind of heuristics can we come up with?

For instance, if an API called immediately after GetCommandLine is ExitProcess then the chances are this program requires command line arguments.
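A rough sketch of this check, assuming the sandbox exposes the monitored API calls as an ordered list of names (the trace format and the function name are hypothetical, not taken from any particular sandbox):

```python
from typing import Sequence

COMMAND_LINE_APIS = {"GetCommandLineA", "GetCommandLineW"}
EXIT_APIS = {"ExitProcess"}


def exits_right_after_command_line(trace: Sequence[str]) -> bool:
    """Return True if ExitProcess directly follows a GetCommandLine* call.

    `trace` is assumed to be the chronological list of API names
    recorded for a single process.
    """
    # Walk over every pair of consecutive calls in the trace.
    for current, following in zip(trace, trace[1:]):
        if current in COMMAND_LINE_APIS and following in EXIT_APIS:
            return True
    return False
```

In practice the two calls are rarely strictly adjacent in a full trace, so a real implementation would probably allow a small window of unrelated calls in between; the strict version is kept here for clarity.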

If we can determine the location and internal layout of the WinMain or main function, and then also of the argc variable (using e.g. signatures, hooking, emulation, or stack monitoring), we can attempt to trace access to this variable. When access is detected, we can try to analyze the code that uses the variable’s value, typically a comparison against the expected number of arguments. If our sample exits almost immediately after this comparison, the program most likely requires command line arguments.

Other possibilities could involve:

  • monitoring of dedicated parsing routines, e.g. getopt function, but also many inline functions that are embedded in popular frameworks
  • string detection for popular arguments, e.g. /s, -embedding
  • string detection for help information, e.g.: usage:
  • detection of installer type, version (they usually accept some command line arguments that are predefined)
  • fuzzy comparison against known files (if we know sample X required command line arguments, chances are that a similar file will too)
  • ‘reverse proof’ of no CLI requirement
    • if it calls GUI functions then less likely to wait for arguments (but may still accept them)
    • if it is an installer, then we typically know how to handle it (e.g. using clickers)
    • if it is a driver – no command line arguments
    • if it is a DLL, most likely no command line processing (BUT some of the exported functions do rely on command line arguments!)
  • etc.
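The two string-detection ideas from the list above can be sketched as follows. This is only an illustration under simplifying assumptions: it looks at printable ASCII runs only (no UTF-16 strings), the marker lists contain just the examples mentioned above, and the function name is made up.

```python
import re

# Printable ASCII runs of at least 2 characters (2, so that "/s" qualifies).
ASCII_RUN = re.compile(rb"[\x20-\x7e]{2,}")

HELP_MARKERS = ("usage:",)                 # matched as a substring
POPULAR_ARGUMENTS = {"/s", "-embedding"}   # matched as a whole string


def command_line_hints(data: bytes) -> list[str]:
    """Return hints that the sample may take command line arguments."""
    hints = []
    for match in ASCII_RUN.finditer(data):
        # Normalize so that matching is case-insensitive.
        text = match.group().decode("ascii").strip().lower()
        if any(marker in text for marker in HELP_MARKERS):
            hints.append(f"help text: {text}")
        elif text in POPULAR_ARGUMENTS:
            hints.append(f"argument: {text}")
    return hints
```

Arguments are matched as whole strings on purpose: a substring search for something as short as /s would fire on nearly every URL or path embedded in the file.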

Overall this is a non-trivial task and the chances of offering a generic solution here are very poor, but it is a good idea to at least flag the file for manual analysis, either in-house or in a report for the client.

Total Commander Plugins & Their Automated Installation

May 12, 2019 in Clustering, File Formats ZOO, Forensic Analysis, Malware Analysis

Total Commander (TC) and its plug-ins have been floating around the internet for a veeeery long time. As such, most of what I describe below is most likely known to many people. However, as with everything, it’s always good to describe stuff from a DF/IR angle, especially for ppl who have never used or come across the program itself or this very specific subset of its features.

And since this post is about TC, I must say for the millionth time that if you use Windows Explorer as your go-to File Manager you are hurting yourself a lot. Once you try TC, FAR, or any of the Orthodox File Managers, there is no way back. It’s worth every single eurocent you have to pay for it. Btw. I am not paid to endorse this software, I just love it and recommend it to anyone who wants to be more efficient.

Like many other popular programs TC supports plug-ins. There are many of these, they are often pretty cool and they add a lot of extra features to this awesome program e.g. handling of additional file types, direct access to Registry, additional archiving options, etc. Some examples can be found here. The page I linked to also includes a number of .chm files (search for a ‘guide’ keyword) that describe how to build your own plug-ins.

The topic of this post is not TC or its plug-ins coding though. At least not directly.

What I want to mention is this: the plug-ins support a mechanism that the author of TC calls ‘Plugins Automated Installation’. The idea is pretty straightforward – any common archive that TC opens/sees that has a ‘pluginst.inf’ file inside it will make TC recognize the archive as a possible TC plug-in. This is actually a very handy auto-install method and I have personally used it a number of times in the past.

When one opens such an archive in TC, the following windows may appear:


These messages come from the pluginst.inf file itself. The structure of the file is like any other standard .ini file, plus it needs these section bits to be defined:

description = description in English
description<lngcode> = description in the language identified by <lngcode>

The type determines what kind of plug-in it is. The available types are: wcx, wfx, wlx, wdx, or lng. Additionally, plug-ins may include files with a .mnu or .bar file extension + various misc. files that support their work.

Here’s a quick breakdown of what these file extensions are associated with:

  • WCX – New Archive formats.
  • WDX – New columns (properties) for items.
  • WLX – Lister plug-ins
  • WFX – New file systems (e.g. extX)
  • LNG – Language files
  • MNU + BAR + INI – Menu files

That’s it really.

While interaction is required to install these, it’s always better to be safe than sorry. For this reason, it is perhaps worth blocking these file extensions on the email gateway. I actually added them to my old post about file extensions of interest for exactly this reason.

The presence of pluginst.inf inside a compressed file (note: it doesn’t need to be a .zip; 7z works too!) is also a handy way to properly identify the archive file subtype.
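A minimal sketch of that identification for the .zip case only, using Python’s standard zipfile module. The function name is made up, and matching pluginst.inf case-insensitively in any folder of the archive is my assumption; a stricter check might only accept it at the archive root. Other container types (e.g. 7z) would need a different library.

```python
import zipfile


def looks_like_tc_plugin_archive(archive) -> bool:
    """Return True if a zip archive contains a pluginst.inf entry.

    `archive` may be a path or a seekable binary file object.
    """
    if not zipfile.is_zipfile(archive):
        return False
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            # Zip entry names always use "/" as the separator.
            if name.rsplit("/", 1)[-1].lower() == "pluginst.inf":
                return True
    return False
```

A positive hit only says the archive claims to be a TC plug-in; the pluginst.inf contents and the bundled files still need to be looked at.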