Decrypting SHell Compiled (SHC) ELF files

In a recent blog post, AhnLab described a campaign that relies on SHell Compiled (SHC) ELF files. I wanted to see if I could replicate their reverse engineering work and decrypt the actual shell commands they shared in their post. This turned out to be a bit more complicated than I expected, hence this post, which aims to make it a bit easier for you.

Before I go into the nitty-gritty details: when I try to crack new stuff I usually look for an existing body of work first. A really quick Google search helped me discover a tool called UnSHC that helps to decrypt SHC ELF files in a generic way. And I must digress here for a moment, because it’s an amazing hack of a tool: a shell script that chains lots of CLI tools to discover the inner workings of the encryption via automated code analysis, find the decryption routine, and then actually produce the decrypted script as output. While it didn’t work for me, I feel kudos to the author are in order. Studying the inner workings of UnSHC was a pure pleasure. Thumbs up!

Coming back to the AhnLab post. I was intrigued by the Alleged RC4 encryption used by SHC and thought: okay, I just need to load the sample into a debugger, step through it, maybe instrument it here and there, and then look at the decrypted buffers. So I did that, and then realized that despite walking through the code, I could only decrypt part of the encrypted data. I could decrypt the ‘internal’ strings of SHC, but not the final shell script. I had correctly guessed where the encrypted shell script was, and I had instrumented the debugger to go there, but when I tried to decrypt the encrypted blob I got another binary blob that looked like garbage.

Hmm, something was really fishy there.
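For reference, textbook RC4 (the ‘Alleged RC4’ cipher mentioned above) is short enough to sketch in Python. This is only the generic algorithm; how SHC derives and mixes its key material is not covered here, and the key and data below are made-up test values rather than anything taken from the sample.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR."""
    # Key-scheduling algorithm (KSA): permute the state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation algorithm (PRGA): XOR data with the keystream.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


# Encryption and decryption are the same operation.
ciphertext = rc4(b"Key", b"Plaintext")
roundtrip = rc4(b"Key", ciphertext)
print(ciphertext.hex(), roundtrip)
```

Because the cipher is symmetric, getting garbage out of a correct implementation usually points at wrong key state rather than a wrong routine, which is what turned out to be the case here.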

After staring at the sample’s code in IDA, I noticed a routine in which the executable tries to retrieve the value of a particular environment variable. After studying it a bit more, I realized that the SHC author used a clever trick to make reversers’ lives a bit harder. When the program is executed, it adds an environment variable to the process environment; the variable and its value are derived from the process ID, the number of arguments (argc), and the actual routine itself (probably to detect ‘int 3’ opcodes in it, if debugged). The program then calls execve on its own file. This makes the program restart with the same PID (the ELF image is overwritten in memory and restarted). The restart leads to the aforementioned routine being executed again, and this time the required environment variable and its value are present. Only then is the decryption of the actual embedded shell script possible. From a debugging/instrumenting perspective this is unpleasant, so I had to quickly devise a way to bypass it.

It turned out that it’s easier than expected.

The solution: the very same ‘environment-operating’ routine can be called twice. The first time it will look for the environment variable, and it won’t find it there, so it will add it. The second time we execute it via instrumentation, the environment variable will be there. So, it will read the value of the environment variable, set appropriate inner variables, and with that in place we can decrypt the main shell script within a single process instance.
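The idea can be modelled with a few lines of Python. This is a simplified sketch, not SHC’s actual code: the function name env_gate, the variable naming scheme and the value format are all hypothetical, and the integrity check over the routine’s own bytes is left out.

```python
def env_gate(environ: dict, pid: int, argc: int) -> int:
    """Simplified model of the environment check.

    Returns 0 when the marker was missing and has just been added (the real
    binary would now execve itself), 1 when a valid marker is present
    (decryption may proceed), and -1 when the marker does not match.
    """
    name = f"x{pid:x}"          # hypothetical: variable name derived from the PID
    expected = f"{pid} {argc}"  # hypothetical: value derived from PID and argc
    value = environ.get(name)
    if value is None:
        environ[name] = expected
        return 0
    return 1 if value == expected else -1


env = {}
first = env_gate(env, pid=1234, argc=1)   # first pass: marker gets planted
second = env_gate(env, pid=1234, argc=1)  # second pass, same process, same PID
print(first, second)  # prints "0 1"
```

Since execve keeps the PID, a second call inside the same process sees exactly what the restarted process would have seen, which is why re-running the routine in the debugger is enough.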

Let’s have a look at the example: 256ab7aa7b94c47ae6ad6ca8ebad7e2734ebaa21542934604eb7230143137342.

We load it into edb first, and then set a breakpoint on 0x400FDD, which is just before the aforementioned ‘environment variable’ tinkering procedure is executed. Then we run the program (F9). We should hit the breakpoint at that address.

We step over it (F8, F8) and end up at 0x400FE5.

We then re-run the code above so that it behaves as if we were in the new instance of the process. So, we go back to 0x400FDD and set RIP to that address: right click and ‘Set RIP to this instruction’, or CTRL+*. We press F8 twice again, and we are set.

All we have to do now is press F8 repeatedly until we reach 0x4011A7. At this stage, point your dump window to the location rdi points to. Then execute the decryption routine and you will see the decrypted shell script in the data dump window.

Not installing the installers

Looking at installers of goodware is quite boring. They do the right thing, at least most of the time, and there is not much to see there. However, if you add some scale and automation to it, you may actually find some value there. For both Red and Blue sides of the fence.

The most popular installers for Windows are Nullsoft and InnoSetup (apart from MSI). Luckily, we have good decompilers available for both of them (7z for Nullsoft and InnoUnp for InnoSetup), so anyone wanting to explore the possibilities just needs to run these on a bunch of clean samples.

The decompilation results are interesting for many reasons.

If the installer is signed, it may execute its installation script and may bypass EDRs. I obviously have no idea if this is always the case, but if VT says it’s signed and ‘green’ by all AVs, the chances are high that whatever the sample does, it will be permitted to do so.

The opportunity this brings to red teams (RT) is that some of the installers’ actions may help to deliver functionality that RT can abuse.

Many installers add a Run key. It’s a lame use case, but one could run such an installer, get all the settings in place via a trusted, signed binary, and then swap the executable referenced by the Run key with a payload of choice.
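Finding such installers in bulk can be sketched as follows. The code assumes the decompiled InnoSetup script follows the usual ‘Name: value; Name: value’ layout of a [Registry] section; the sample text is made up for illustration, and the helper names are hypothetical.

```python
import re

# "Name: value" pairs separated by ';', where a value may be double-quoted.
PARAM = re.compile(r'(\w+):\s*("(?:[^"]|"")*"|[^;]*)')
RUN_KEY = re.compile(r'\\currentversion\\run(once)?$', re.I)


def parse_entry(line):
    """Split one script entry into a dict of lower-cased parameter names."""
    params = {}
    for name, raw in PARAM.findall(line):
        raw = raw.strip()
        if len(raw) >= 2 and raw[0] == raw[-1] == '"':
            raw = raw[1:-1].replace('""', '"')  # unescape doubled quotes
        params[name.lower()] = raw
    return params


def run_key_entries(iss_text):
    """Return (root, value name, value data) for Run/RunOnce registry entries."""
    found = []
    section = None
    for line in iss_text.splitlines():
        line = line.strip()
        if line.startswith('[') and line.endswith(']'):
            section = line[1:-1].lower()
        elif section == 'registry' and line and not line.startswith(';'):
            p = parse_entry(line)
            if RUN_KEY.search(p.get('subkey', '')):
                found.append((p.get('root'), p.get('valuename'), p.get('valuedata')))
    return found


# Made-up sample, not taken from a real installer.
SAMPLE = r'''
[Registry]
Root: HKCU; Subkey: "Software\Microsoft\Windows\CurrentVersion\Run"; ValueType: string; ValueName: "DemoApp"; ValueData: "{app}\demo.exe /tray"
Root: HKLM; Subkey: "Software\DemoVendor\DemoApp"; ValueType: string; ValueName: "Version"; ValueData: "1.0"
'''
entries = run_key_entries(SAMPLE)
print(entries)
```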

Another opportunity for RT is domain recycling. Many older installers refer to domains that no longer exist. By combing the decompiled installation scripts you may find domains that you could re-use. It is quite possible that the domain of an old, now-defunct software developer had all the green marks from web proxy/IDS/IPS and even e-mail security vendors, as well as VT, and that this classification has never been updated. By recycling such a domain you may get a nice way to create a ‘clean’ C2 channel or deliver phish/malspam. And if you are very, very lucky, some people may still be using that old software. What if the software has an auto-update mechanism? Hijacking a legacy auto-update mechanism could amount to a supply-chain attack, and potentially a big bounty win.
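A rough way to comb decompiled scripts for candidate domains is sketched below. The regular expression is deliberately simple, the sample text is invented, and the resolves helper is only a first-pass filter: a name that fails to resolve is not necessarily unregistered, so a registrar or WHOIS lookup would still be needed.

```python
import re
import socket
from collections import Counter

# Host part of http(s) URLs; intentionally simple, so expect some noise.
URL_HOST = re.compile(r'(?i)\bhttps?://([a-z0-9.-]+\.[a-z]{2,})')


def extract_hosts(script_text):
    """Count host names referenced in a decompiled installation script."""
    return Counter(host.lower() for host in URL_HOST.findall(script_text))


def resolves(host):
    """Return True if the host currently resolves (requires network access)."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


# Made-up sample, not taken from a real installer.
SAMPLE = '''
ExecShell "open" "http://www.example.com/thanks"
Filename: "http://updates.example.org:8080/check"; Flags: shellexec
Filename: "HTTP://WWW.EXAMPLE.COM/help"
'''
hosts = extract_hosts(SAMPLE)
print(hosts.most_common())
```

Running extract_hosts over a whole corpus and then checking only the rarely seen hosts with resolves keeps the number of lookups manageable.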

DLL sideloading or LOLBin executable spawning via installers is also possible, either via a clever race condition, one-off opportunities, or by leveraging a GUI that pauses the installer for a moment (enough time to swap files in a tmp folder). It really depends on the scenario and you may not find a lot of such installers, but hey… it’s possible.

From a forensic perspective, decompilation of installation scripts gives us yet another way to discover clusters of ‘clean’ paths and file names. These can form a nice exclusion list for analysis. There is also a great opportunity to create an exclusion list for process parent-child relationships: many installers are ‘told’ to run some executable at the end of the installation, or simply to open a site in the default browser. Most sandboxes and EDRs are blind to this, and their analysis results often include lots of unnecessary artifacts that could potentially be excluded from such reports. For example, if an analyzed sample’s decompiled script tells us the installer opens the browser, the whole chain of events that follows could be excluded from the final report.
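As a sketch of the browser case, the snippet below looks for NSIS ExecShell instructions whose target is a URL. It assumes the decompiled script keeps the ‘ExecShell action command’ form; the exact quoting produced by a given decompiler may differ, so the tokenizer accepts double quotes, single quotes, backticks or bare words. The sample is made up.

```python
import re

# One NSIS token: "double", 'single' or `backtick` quoted, or a bare word.
TOKEN = re.compile(r'"([^"]*)"|\'([^\']*)\'|`([^`]*)`|(\S+)')


def tokens(line):
    """Split a script line into tokens with the surrounding quotes removed."""
    return [next(g for g in m.groups() if g is not None)
            for m in TOKEN.finditer(line)]


def browser_launches(nsi_text):
    """Return URLs that the script opens via ExecShell."""
    urls = []
    for line in nsi_text.splitlines():
        parts = tokens(line.strip())
        if len(parts) >= 3 and parts[0].lower() == 'execshell':
            target = parts[2]
            if target.lower().startswith(('http://', 'https://')):
                urls.append(target)
    return urls


# Made-up sample, not taken from a real installer.
SAMPLE = r"""
Section
  ExecWait '"$INSTDIR\helper.exe" /S'
  ExecShell "open" "https://example.com/thanks"
  ExecShell open $INSTDIR\readme.txt
SectionEnd
"""
launches = browser_launches(SAMPLE)
print(launches)
```

A non-empty result for a sample suggests that a browser process spawned during detonation is expected behaviour and a candidate for exclusion from the report.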

Ever wondered what the source of some processes, services, or tasks running on a system is? Combing through decompiled installation scripts answers a lot of these questions. Even better, it explains many of the command line switches we see in process parent-child relationships. We may not know their meaning, but we can learn that they are preprogrammed inside the installation scripts! In other words, we can build a nice list of ‘good command line switches’ for specific processes.
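Building such a list could look roughly like this for InnoSetup scripts. The sketch assumes the decompiled script exposes [Run] and [UninstallRun] sections with Filename and Parameters fields; the sample entries are invented and known_switches is a hypothetical helper name.

```python
import re
from collections import defaultdict

FILENAME = re.compile(r'\bFilename:\s*"([^"]*)"', re.I)
PARAMETERS = re.compile(r'\bParameters:\s*"((?:[^"]|"")*)"', re.I)


def known_switches(iss_texts):
    """Map each executable name to the command lines installers launch it with."""
    seen = defaultdict(set)
    for text in iss_texts:
        section = None
        for line in text.splitlines():
            line = line.strip()
            if line.startswith('[') and line.endswith(']'):
                section = line[1:-1].lower()
                continue
            if section not in ('run', 'uninstallrun'):
                continue
            filename = FILENAME.search(line)
            if not filename:
                continue
            # Keep only the file name, dropping constants such as {app}.
            exe = filename.group(1).rsplit('\\', 1)[-1].lower()
            params = PARAMETERS.search(line)
            args = params.group(1).replace('""', '"') if params else ''
            seen[exe].add(args)
    return {exe: sorted(args) for exe, args in seen.items()}


# Made-up sample, not taken from a real installer.
SAMPLE = r'''
[Run]
Filename: "{app}\updater.exe"; Parameters: "/silent /register"; Flags: runhidden
Filename: "{sys}\regsvr32.exe"; Parameters: "/s ""{app}\shell.dll"""
'''
switches = known_switches([SAMPLE])
print(switches)
```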

The ‘open browser at the end of the installation or uninstall’ scenarios are very useful for us too. We can use them to detect very specific events of users installing software that is outside of the acceptable use policy. Yes, we can use EDR or asset inventory tools for that too, but what if the software is portable? Any clue of an install event is important.

Finally, you could possibly write signatures/yara definitions for installation scripts that could help to detect different versions of the same software without the need to sandbox them.

I am sure there are more ideas out there.