Not installing the installers, part 2

In the last post I described how we can pull some interesting metadata from decompiled installers. Today I want to discuss one practical example of how this data can enrich analysis, both manual and automatic (f.ex. sandbox, EDR).

Many programs cannot be properly analyzed by sandboxes, because they require command line arguments. While command line options for native Windows OS binaries are usually well documented (well, not really, there is a lot of undocumented stuff, but let’s forget about it for a second), command line options used by goodware are a completely different story. And of course, it is even worse for malware.

The good news about goodware is that it handles command line arguments in a very predictable way. The string comparisons are usually ‘naive’, direct and not optimized, and often the programs include actual help that can be seen after running the program with the /?, /help, -h, or --help arguments. Very often, a search for the ‘usage’ keyword inside the binary can help us find the options that the program recognizes. f.ex. this is what we see inside cscript.exe:

Predictable is good, and can serve at least a few purposes:

  • we can generate a list of known parameter strings that goodware typically uses (and even attribute it to specific software vendors)
  • we can create yara signatures for these
  • we can incorporate this set into EDR command line parsing routines to assess an invocation’s similarity to known good
  • we can also leverage this to run the sample in a sandbox or manually with these easily discovered command line arguments (kinda like assisted fuzzing)
  • we can include these findings in the report itself to hint to analysts that they may need to do some manual reversing (it would seem the program accepts the following command line arguments…)
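The ‘similarity to known good’ idea from the list above can be sketched in a few lines of Python. This is only an illustration: the `KNOWN_GOOD_SWITCHES` set is a tiny hand-picked sample (a real one would be loaded from the extracted list), and the function names and the ratio-based score are my assumptions, not something any EDR actually implements.

```python
import re

# Hypothetical sample only; a real set would come from the extracted list.
KNOWN_GOOD_SWITCHES = {"/s", "/silent", "/verysilent", "/norestart", "/quiet"}

# A switch is a slash plus word characters, at the start or after whitespace.
# The lookbehind keeps path fragments such as C:/foo/bar from matching.
SWITCH_RE = re.compile(r"(?<!\S)/[A-Za-z0-9_]+")


def extract_switches(command_line):
    """Return the lowercased /switch tokens found in a command line."""
    return [m.group(0).lower() for m in SWITCH_RE.finditer(command_line)]


def known_good_ratio(command_line, known=KNOWN_GOOD_SWITCHES):
    """Fraction of switches in the command line that are in the known set.

    Returns None when the command line has no /switch tokens at all.
    """
    switches = extract_switches(command_line)
    if not switches:
        return None
    hits = sum(1 for s in switches if s in known)
    return hits / len(switches)
```

A low ratio does not mean ‘bad’, it only means ‘not seen in the reference set’, so a score like this is better used for prioritization than for blocking.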

Looking for typical command line arguments is actually quite difficult. There are a lot of ways to implement string comparisons and, as I explained a long time ago in one of my sandbox series, there are like a gazillion different string functions out there. Plus, different compilers and different optimizations make the code even harder to comprehend. A naive search for /[a-zA-Z_0-9]_/ could work on a binary level, but it is going to hit a lot of FPs. Decompiled scripts can come to the rescue, as they include actual invocations of programs and specify precisely what parameters will be passed to the program.

The attached list focuses on a basic command line argument extraction (just the /foo part) from around 10K decompiled scripts. More advanced analysis would include options taking parameters (f.ex. /foo bar).
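A rough sketch of this kind of extraction, assuming the decompiled script contains NSIS-style `Exec` / `ExecWait` lines followed by a command string. The exact layout of decompiler output varies between tools and versions, so both regular expressions below are assumptions to adapt, and `switches_from_script` is a hypothetical name. The second element of each key records whether the switch was followed by a bare value (the /foo bar case).

```python
import re
from collections import Counter

# Assumed line shape: Exec or ExecWait followed by the command string.
EXEC_RE = re.compile(r"^\s*(?:Exec|ExecWait)\s+(.*)$", re.IGNORECASE)

# /switch, optionally followed by a bare value: a token that does not start
# with a slash, a dash or a quote (\x27 is the single quote character).
SWITCH_RE = re.compile(
    r'(?<!\S)(/[A-Za-z0-9_]+)(?:\s+([^\s/\-"\x27][^\s"\x27]*))?'
)


def switches_from_script(script_text):
    """Count (switch, takes_value) pairs seen in Exec/ExecWait lines."""
    counts = Counter()
    for line in script_text.splitlines():
        m = EXEC_RE.match(line)
        if not m:
            continue
        for switch, value in SWITCH_RE.findall(m.group(1)):
            counts[(switch.lower(), bool(value))] += 1
    return counts
```

Run over many scripts, the counters can be summed to get a frequency-ranked list, which also helps to weed out one-off noise.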

You can download it from here.

Not installing the installers

Looking at installers of goodware is quite boring. They do the right thing, at least most of the time, and there is not much to see there. However, if you add some scale and automation to it, you may actually find some value there. For both Red and Blue sides of the fence.

The most popular installers for Windows are Nullsoft and InnoSetup (apart from MSI). Luckily, we have good decompilers available for both of them (7z and InnoUnp, respectively), so anyone wanting to explore the possibilities just needs to run these on a bunch of clean samples.

The decompilation results are interesting for many reasons.

If the installer is signed, it may execute its installation script and may bypass EDRs. Obviously, I have no idea whether this is always the case, but if VT says it’s signed and ‘green’ by all AVs, the chances are high that whatever the sample does, it will be permitted to do so.

The opportunity this fact brings to RT is that some of the installers’ actions may help to deliver some functionality that RT can abuse.

Many installers add a Run key. It’s a lame use case, but one could run such an installer, get all the settings in place via a trusted, signed binary, and then swap the executable referenced by the Run key with a payload of choice.
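To get a feel for how common this is, one can list the Run key registrations found in decompiled scripts. The sketch below assumes NSIS-style `WriteRegStr root "subkey" "name" "value"` lines with every argument in double quotes; real scripts also use single quotes and variables, so treat the pattern as a starting point. The name `run_key_entries` is made up for this example.

```python
import re

# Assumed line shape: WriteRegStr ROOT "subkey" "value name" "value data"
WRITE_REG_RE = re.compile(
    r'^\s*WriteRegStr\s+(\S+)\s+"([^"]*)"\s+"([^"]*)"\s+"([^"]*)"',
    re.IGNORECASE,
)

# Subkey endings that mark an autostart location (compared in lowercase).
RUN_SUFFIXES = ("\\currentversion\\run", "\\currentversion\\runonce")


def run_key_entries(script_text):
    """Return (root, value name, value data) for Run/RunOnce registrations."""
    entries = []
    for line in script_text.splitlines():
        m = WRITE_REG_RE.match(line)
        if not m:
            continue
        root, subkey, name, value = m.groups()
        if subkey.lower().rstrip("\\").endswith(RUN_SUFFIXES):
            entries.append((root.upper(), name, value))
    return entries
```

The same output is useful on the Blue side: it is a list of autostart entries that have a known, benign origin.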

Another opportunity for RT is domain recycling. Many older installers refer to domains that no longer exist. By combing the decompiled installation scripts you may find domains that you could re-use. It is quite possible that an old, now-defunct software developer had all the green marks from web proxy/IDS/IPS, e-mail security vendors and VT, and that this status has never been updated. By recycling such a domain you may get a nice way to create a ‘clean’ C2 channel or deliver phish/malspam. And if you are very, very lucky, some people may still be using that old software. What if the software has an auto-update mechanism? Such cases could be big bounty wins: a legacy auto-update mechanism used as a supply-chain attack vector.
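Pulling the referenced hosts out of a pile of decompiled scripts is the easy part. A minimal sketch, assuming the URLs appear in the script as plain http/https strings; the URL pattern is deliberately crude and `hosts_in_script` is a hypothetical helper name. Whether a given host still resolves or is registered has to be checked separately.

```python
import re
from urllib.parse import urlsplit

# Crude URL matcher: stops at whitespace, quotes, angle brackets and ')'.
# \x27 is the single quote character.
URL_RE = re.compile(r'https?://[^\s"\x27<>)]+', re.IGNORECASE)


def hosts_in_script(script_text):
    """Return the sorted, de-duplicated host names referenced by URLs."""
    hosts = set()
    for url in URL_RE.findall(script_text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed netloc (for example a broken IPv6 literal); skip it.
            continue
        if host:
            hosts.add(host.lower())
    return sorted(hosts)
```

The same list works for defenders too: it shows which domains a given piece of software is expected to contact.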

DLL sideloading or LOLBin executable spawning via installers is also possible, either via a clever race condition, one-off opportunities, or by leveraging a GUI that pauses the installer for a moment (enough time to swap files in a tmp folder). It really depends on the scenario and you may not find a lot of such installers, but hey… it’s possible.

From a forensic perspective, decompilation of installation scripts gives us yet another way to discover clusters of ‘clean’ paths and file names. It can form a nice exclusion list for analysis. There is also a great opportunity to create an exclusion list for process parent-child relationships: many installers are ‘told’ to run some executable at the end of the installation, or simply to open a site in the default browser. Most sandboxes and EDRs are blind to it and their analysis results often include lots of unnecessary artifacts that could potentially be excluded from such reports. For example, if an analyzed sample’s decompiled script tells us the installer does open the browser, the whole chain of events that follows could be excluded from the final report.
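A sketch of how such a parent-child exclusion could look. It assumes NSIS-style `Exec` / `ExecWait` lines in the decompiled script and a made-up event format (a dict with `parent` and `child` image names); real sandbox and EDR telemetry will look different, and following the whole chain of descendants is left out for brevity.

```python
import ntpath
import re

# Assumed line shape: Exec or ExecWait followed by the command string.
EXEC_RE = re.compile(r"^\s*(?:Exec|ExecWait)\s+(.*)$", re.IGNORECASE)
# The launched program: a double-quoted path, or the first bare token.
TARGET_RE = re.compile(r'"([^"]+)"|(\S+)')


def expected_children(script_text):
    """Lowercased file names the script says the installer will launch."""
    names = set()
    for line in script_text.splitlines():
        m = EXEC_RE.match(line)
        if not m:
            continue
        command = m.group(1).strip().strip("'")
        target = TARGET_RE.match(command)
        if target:
            path = target.group(1) or target.group(2)
            names.add(ntpath.basename(path).lower())
    return names


def drop_expected(events, installer_name, children):
    """Remove events where the installer spawns a child the script announces."""
    installer = installer_name.lower()
    return [
        e for e in events
        if not (e["parent"].lower() == installer and e["child"].lower() in children)
    ]
```

Matching on file name alone is weak, of course; a more careful version would compare full paths and command lines as well.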

Ever wondered what the source of some process, service or task running on a system is? Combing through decompiled installation scripts brings a lot of answers to this question. Even more, it provides an explanation for many command line switches we see in the process parent-child relationships. We may not know their meaning, but we may learn they are preprogrammed inside the installation scripts! In other words, we can build a nice list of ‘good command line switches’ for specific processes.

The ‘open browser at the end of the installation or uninstall’ scenarios are very useful for us too. We can use them to detect very specific events of users installing software that is outside of the acceptable use policy. Yes, we can use EDR or asset inventory tools for that too, but what if the software is portable? Any clue of an install event is important.

Finally, you could possibly write signatures/yara definitions for installation scripts that could help to detect different versions of the same software w/o a need to sandbox them.

I am sure there are more ideas out there.